Abstract

Behavioral software contracts are a widely used mechanism for 
governing the flow of values between components. However, run- 
time monitoring and enforcement of contracts imposes significant 
overhead and delays discovery of faulty components to run-time. 

To overcome these issues, we present soft contract verification, 
which aims to statically prove either complete or partial contract 
correctness of components, written in an untyped, higher-order lan- 
guage with first-class contracts. Our approach uses higher-order 
symbolic execution, leveraging contracts as a source of symbolic 
values including unknown behavioral values, and employs an up- 
datable heap of contract invariants to reason about flow-sensitive 
facts. We prove the symbolic execution soundly approximates the 
dynamic semantics and that verified programs can't be blamed.

The approach is able to analyze first-class contracts, recursive 
data structures, unknown functions, and control-flow-sensitive re- 
finements of values, which are all idiomatic in dynamic languages. 
It makes effective use of an off-the-shelf solver to decide problems 
without heavy encodings. The approach is competitive with a wide 
range of existing tools — including type systems, flow analyzers, 
and model checkers — on their own benchmarks. 

Categories and Subject Descriptors D.2.4 [Software Engineering]:
Software/Program Verification; D.3.1 [Programming Languages]:
Formal Definitions and Theory

Keywords Higher-order contracts; symbolic execution 

1. Static verification for dynamic languages 

Contracts (Meyer 1991; Findler and Felleisen 2002) have become 
a prominent mechanism for specifying and enforcing invariants in 
dynamic languages (Disney 2013; Plosch 1997; Austin et al. 2011;
Strickland et al. 2012; Hickey et al. 2013). They offer the expres- 
sivity and flexibility of programming in a dynamic language, while 
still giving strong guarantees about the interaction of components. 
However, there are two downsides: (1) contract monitoring is ex- 
pensive, often prohibitively so, which causes programmers to write 
more lax specifications, compromising correctness for efficiency; 
and (2) contract violations are found only at run-time, which delays 
discovery of faulty components with the usual negative engineering 
consequences. 

Static verification of contracts would empower programmers to
state stronger properties, get immediate feedback on the correct- 
ness of their software, and avoid worries about run-time enforce- 
ment cost since, once verified, contracts could be removed. All- 
or-nothing approaches to verification of typed functional programs 
have seen significant advances in the recent work on static contract
checking (Xu et al. 2009; Xu 2012; Vytiniotis et al. 2013), refine- 
ment type checking (Terauchi 2010; Zhu and Jagannathan 2013; 
Vazou et al. 2013, 2014), and model checking (Kobayashi 2009b; 
Kobayashi et al. 2010, 2011). However, the highly dynamic nature
of untyped languages makes verification more difficult. 

Programs in dynamic languages are often written in idioms that 
thwart even simple verification methods such as type inference. 
Moreover, contracts themselves are written within the host lan- 
guage in the same idiomatic style. This suggests that moving be- 
yond all-or-nothing approaches to verification is necessary. 

In previous work (Tobin-Hochstadt and Van Horn 2012), we 
proposed an approach to soft contract verification, which enables 
piecemeal and modular verification of contracts. This approach 
augments a standard reduction semantics for a functional language 
with contracts and modules by endowing it with a notion of "un- 
known" values refined by sets of contracts. Verification is carried 
out by executing programs on abstract values. 

To demonstrate the essence of the idea, consider the following 
contrived, but illustrative example. Let pos? and neg? be predicates 
for positive and negative integers. Contracts can be arbitrary predi- 
cates, so these functions are also contracts. Consider the following 
contracted function (written in Lisp-like notation): 

(f : pos? -> neg?) ; contract 

(define (f x) (* x -1)) ; function 

We can verify this program by (symbolically) running it on an "un- 
known" input. Checking the domain contract refines the input to be 
an unknown satisfying the set of contracts {pos?}. By embedding 
some basic facts about pos?, neg?, and -1 into the reduction relation
for *, we conclude (* {pos?} -1) ⟼ {neg?}, and voilà,
we've shown once and for all that f meets its contract obligations and
cannot be blamed. We could therefore soundly eliminate any contract
which blames f, in this case neg?.
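
The reduction above can be sketched in Python. This is a minimal illustration under stated assumptions, not the semantics developed in this paper: an unknown value is modeled as a frozenset of contract names, and the hypothetical helper `abstract_mul` hard-codes only the sign facts mentioned in the text.

```python
# Hypothetical sketch: an "unknown" value is a frozenset of the contract
# names it is known to satisfy; concrete integers are plain ints.

def sign_facts(v):
    """Return the sign contracts known to hold for v."""
    if isinstance(v, int):
        facts = set()
        if v > 0:
            facts.add("pos?")
        if v < 0:
            facts.add("neg?")
        return frozenset(facts)
    return frozenset(v)  # already a set of contract names


def abstract_mul(a, b):
    """Abstract `*` with basic facts about pos? and neg? built in."""
    fa, fb = sign_facts(a), sign_facts(b)
    if ("pos?" in fa and "neg?" in fb) or ("neg?" in fa and "pos?" in fb):
        return frozenset({"neg?"})
    if ("pos?" in fa and "pos?" in fb) or ("neg?" in fa and "neg?" in fb):
        return frozenset({"pos?"})
    return frozenset()  # nothing is known about the product


def f(x):
    return abstract_mul(x, -1)


def verify_f():
    """Run f on an unknown refined by its domain contract pos?."""
    result = f(frozenset({"pos?"}))
    return "neg?" in result  # True means f cannot be blamed
```

Calling `verify_f()` runs f once on the refined unknown, which stands for every positive integer at the same time.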

This approach is simple and effective for many programs, but 
suffers from several shortcomings, which we solve in this paper: 

Solver-aided reasoning: While embedding symbolic arithmetic 
knowledge for specific, known contracts works for simple exam- 
ples, it fails to reason about arithmetic generally. Contracts often fail 
to verify because equivalent formulations of contracts are not hard- 
coded in the semantics of primitives. Many systems address this 
issue by incorporating an SMT solver. However, for a higher-order 
language, solver integration is often achieved by reasoning in a the- 
ory of uninterpreted functions or semantic embeddings (Knowles 
and Flanagan 2010; Rondon et al. 2008; Vytiniotis et al. 2013). 

In this paper, we observe that higher-order contracts can be
effectively verified using only a simple first-order solver. The key
insight is that contracts delay higher-order checks and failures always
occur with a first-order witness. By relying on a (symbolic) semantic
approach to carry out higher-order contract monitoring, we can
use an SMT solver to reason about integers without the need for 
sophisticated encodings. (Examples in §2.3.) 

Flow sensitive reasoning: Just as our semantic approach de- 
composes higher-order contracts into first-order properties, first- 
order contracts naturally decompose into conditionals. Our prior 
approach fails to reason effectively about conditionals, requiring 
contract checks to be built-in to the semantics. As a result, even 
simple programs with conditionals fail to verify: 

(g : int? -> neg?) 

(define (g x) (if (pos? x) (f x) (f 8))) 

This is because the true-branch call to f is (f {int?}) by substitution,
although we know from the guard that x satisfies pos?.

In this paper, we observe that flow-sensitivity can be achieved 
by replacing substitution with heap-allocated abstract values. 
These heap addresses are then refined as they flow through pred- 
icates and primitive operations, with no need for special handling 
of contracts (§2.2). As a result, the system is not only effective 
for contract verification, but can also handle safety verification for 
programs with no contracts at all. 

First-class contracts: Pragmatic contract systems enable first- 
class contracts so new combinators can be written as functions that 
consume and produce contracts. But to the best of our knowledge, 
no verification system currently supports first-class contracts (or
refinements), and in most approaches it appears fundamentally dif- 
ficult to incorporate such a notion. 

Because we handle contracts (and all other features) by execu- 
tion, first-class contracts pose no significant technical challenge and 
our system reasons about them effectively (§2.5). 

Converging for non-tail recursion: Of course, simply executing 
programs has a fundamental drawback — it will fail to terminate 
in many cases, and when the inputs are unknown, execution will 
almost always diverge. Our prior work used a simple loop detection 
algorithm that handled only tail-recursive functions. As a result, 
even simple programs operating over inductive data timed out. 

In this paper, we accelerate the convergence of programs by 
identifying and approximating regular accumulation of evaluation 
contexts, causing common recursive programs to converge on un- 
known values, while providing precise predictions (§2.4). As with 
the rest of our approach, this happens during execution and is there- 
fore robust to complex, higher-order control flow. 



Combining these techniques yields a system competitive with a 
diverse range of existing powerful static checkers, achieving many 
of their strengths in concert, while balancing the benefits of static 
contract verification with the flexibility of dynamic enforcement. 

We have built a prototype soft verification engine, which we dub 
SCV, based on these ideas and used it to evaluate the approach (§4). 
Our evaluation demonstrates that the approach can verify proper- 
ties typically reserved for approaches that rely on an underlying 
type system, while simultaneously accommodating the dynamism 
and idioms of untyped programming languages. We take examples
from work on soft typing (Cartwright and Fagan 1991; Wright
and Cartwright 1997), type systems for untyped languages (Tobin- 
Hochstadt and Felleisen 2010), static contract checking (Xu et al. 
2009; Xu 2012), refinement type checking (Terauchi 2010), and 
model checking of higher-order functional languages (Kobayashi 
2009b; Kobayashi et al. 2010, 2011).

SCV can prove all contract checks redundant for almost all of the
examples taken from this broad array of existing program analysis
and type checking work, and can handle many of the tricky higher-order
verification problems demonstrated by other systems. In other
words, our approach is competitive with type systems, model check- 
ers, and soft typing systems on each of their chosen benchmarks — 
in contrast, work on higher-order model checking does not handle 
benchmarks aimed at soft typing or occurrence typing, and vice 
versa. In the cases where SCV does not prove the complete ab- 
sence of contract errors, the vast majority of possible dynamic errors 
are ruled out, justifying numerous potential optimizations. Over this 
corpus of programs, 99% of the contract and run-time type checks 
are proved safe, and could be eliminated. 

We also evaluate the verification of three small interactive video 
games which use first-class and dependent contracts pervasively. 
The results show the subsequent elimination of contract monitoring 
has a dramatic effect: from a speed-up by a factor of 7 in one case, to
three orders of magnitude in the others. In essence, these results 
show the games are infeasible without contract verification. 

2. Worked examples 

We now present the main ideas of our approach through a series of 
examples taken from work on other verification techniques, starting 
from the simplest and working up to a complex object encoding. 

2.1 Higher-order symbolic reasoning 

Consider the following simple function that transforms functions 
on even integers into functions on odd integers. It has been ascribed 
this specification as a contract, which can be monitored at run-time. 

(e2o : (even? -> even?) -> (odd? -> odd?))
(define (e2o f)
  (λ (n) (- (f (+ n 1)) 1)))

A contract monitors the flow of values between components. In 
this case, the contract monitors the interaction between the context 
and the e2o function. It is easy to confirm that e2o is correct with 
respect to the contract; e2o holds up its end of the agreement, and 
therefore cannot be blamed for any run-time failures that may arise. 
The informal reasoning goes like this: First assume f is an even? 
-> even? function. When applied, we must ensure the argument is 
even (otherwise e2o is at fault), but may assume the result is even 
(otherwise the context is at fault). Next assume n is odd (otherwise 
the context is at fault) and ensure the result is odd (otherwise e2o 
is at fault). Since (+ n 1) is even when n is odd, f is applied to an 
even argument, producing an even result. Subtracting one therefore 
gives an odd result, as desired. 

This kind of reasoning mimics the step-by-step computation of 
e2o, but rather than considering some particular inputs, it considers 
these inputs symbolically to verify all possible executions of e2o. 
We systematize this kind of reasoning by augmenting a standard 
reduction semantics for contracts with symbolic values that are 
refined by sets of contracts. At first approximation, the semantics 
includes reductions such as: 

(+ {odd?} 1) ⟼ {even?}, and

({even? -> even?} {even?}) ⟼ {even?}.

This kind of symbolic reasoning mimics a programmer's infor- 
mal intuitions which employ contracts to refine unknown values and 
to verify components meet their specifications. If a component can- 
not be blamed in the symbolic semantics, we can safely conclude it 
cannot be blamed in general. 
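
The informal argument for e2o can be replayed mechanically. The following is a minimal sketch, assuming symbolic values are frozensets of contract names; `shift_parity`, `unknown_even_to_even`, and the `Blame` exception are hypothetical names introduced only for this illustration.

```python
# Hypothetical sketch: symbolic integers are frozensets of contract names.
EVEN, ODD = "even?", "odd?"


class Blame(Exception):
    """Raised with the name of the party that broke the contract."""


def shift_parity(v):
    """Abstract (+ v 1) or (- v 1): parity flips when it is known."""
    if EVEN in v:
        return frozenset({ODD})
    if ODD in v:
        return frozenset({EVEN})
    return frozenset()


def unknown_even_to_even(arg):
    """An unknown function refined by the contract even? -> even?."""
    if EVEN not in arg:
        raise Blame("e2o")  # e2o supplied a possibly non-even argument
    return frozenset({EVEN})  # the range is the context's obligation


def e2o(f):
    return lambda n: shift_parity(f(shift_parity(n)))


def verify_e2o():
    """Apply the result of e2o to an unknown odd integer."""
    result = e2o(unknown_even_to_even)(frozenset({ODD}))
    return ODD in result  # True means e2o met its obligations
```

A variant that forwards n to f without adding one would raise `Blame("e2o")` on the same input, which is the symbolic counterpart of e2o being at fault.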

2.2 Flow sensitive reasoning 

Programmers using untyped languages often use a mixture of type- 
based and flow-based reasoning to design programs. The analysis 
naturally takes advantage of type tests idiomatic in dynamic lan- 
guages even when the tests are buried in complex expressions. The 
following function, taken from work on occurrence typing
(Tobin-Hochstadt and Felleisen 2010), can be proven safe using our
symbolic semantics:

(f : (or/c int? str?) cons? -> int?)
(define (f x p)
  (cond
    [(and (int? x) (int? (car p))) (+ x (car p))]
    [(int? (car p)) (+ (str-len x) (car p))]
    [else 0]))

Here, int?, str?, and cons? are type predicates for integers, 
strings, and pairs, respectively. The contract (or/c int? str?) 
uses the or/c contract combinator to construct a contract specifying 
a value is either an integer or a string. 

A programmer would convince themselves this program was 
safe by using the control dominating predicates to refine the types 
of x and (car p) in each branch of the conditional.¹ Our symbolic
semantics accommodates exactly this kind of reasoning in order to 
verify this example. However, there is a technical challenge here. A 
straightforward substitution-based semantics would not reflect the 
flow-sensitive facts. Focusing just on the first clause, a substitution 
model would give: 

(cond 

[(and (int? {(or/c int? str?)}) (int? (car {cons?}))) 
(+ {(or/c int? str?)} (car {cons?}))] ...) 

At this point, it's too late to communicate the refinement of these 
sets implied by the test evaluating to true, so the semantics would 
report the contract on + potentially being violated because the first 
argument may be a string, and the second argument may be any- 
thing. We overcome this challenge by modelling symbolic values as 
heap-allocated sets of contracts. When predicates and data structure 
accessors are applied to heap addresses, we refine the correspond- 
ing sets to reflect what must be true. So the program is modelled 
as: 

(cond
  [(and (int? L1) (int? (car L2)))
   (+ L1 (car L2))] ...)
where L1 ↦ {(or/c int? str?)}, L2 ↦ {cons?}

When evaluation of the test reaches (int? L1), the
semantics conceptually forks the evaluator and refines the heap:

(int? L1) ⟼ true, where L1 ↦ {int?}
          ⟼ false, where L1 ↦ {str?}

Similar refinements to L2 are communicated through the heap for 
(int? (car L2)), thereby making (+ L1 (car L2)) safe.
This simple idea is effective in achieving flow-based refinements. It 
naturally handles deeply nested and inter-procedural conditionals. 
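
The forking step can be sketched as follows. This is an illustration under simplifying assumptions, not the reduction relation of this paper: the heap is a Python dict from address names to frozensets of contract names, and the address `car_L2` is a hypothetical stand-in for the already-refined result of (car L2).

```python
# Hypothetical sketch: the heap maps addresses to sets of contract names.
OR_INT_STR = "(or/c int? str?)"


def fork_int_test(heap, addr):
    """Evaluate (int? addr); return one (answer, refined_heap) per outcome."""
    known = heap[addr]
    if "int?" in known:
        return [(True, heap)]   # the test is certainly true
    if "str?" in known:
        return [(False, heap)]  # the test is certainly false
    outcomes = []
    for answer in (True, False):
        if OR_INT_STR in known:
            # The disjunction collapses to the disjunct matching the answer.
            refined = (known - {OR_INT_STR}) | {"int?" if answer else "str?"}
        else:
            refined = known | {"int?" if answer else "(not/c int?)"}
        new_heap = dict(heap)   # each branch gets its own copy of the heap
        new_heap[addr] = frozenset(refined)
        outcomes.append((answer, new_heap))
    return outcomes


def plus_is_safe(heap, a, b):
    """(+ a b) cannot be blamed when both addresses are known integers."""
    return "int?" in heap[a] and "int?" in heap[b]


heap0 = {"L1": frozenset({OR_INT_STR}), "car_L2": frozenset({"int?"})}
branches = fork_int_test(heap0, "L1")
```

Because every later use of L1 looks the address up in the branch's own heap, the addition in the true branch sees {int?} rather than the original disjunction.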

2.3 Incorporating an SMT solver 

The techniques described so far are highly effective for reasoning 
about functions and many kinds of recursive data structures. How- 
ever, effective reasoning about many kinds of base values, such as 
integers, requires sophisticated domain-specific knowledge. Rather 
than build such a tool ourselves, we defer to existing high-quality 
solvers for these domains. Unlike many solver-aided verification 
tools, however, we use the solver only for queries on base values, 
rather than attempting to encode a rich, higher-order language into 
one that is accepted by the solver. 



¹ The call to str-len is safe because (and (int? x) (int? (car p)))
being false and (int? (car p)) being true implies that (int? x) is false,
which in turn implies x is a string as enforced by f's contract.



To demonstrate our approach, we take an example (intro3) 
from work on model checking higher-order programs (Kobayashi 
et al. 2010).

(>/c : int? -> any -> bool?)
(define (>/c lo) (λ (x) (and (int? x) (> x lo))))

(define (f x g) (g (+ x 1)))

(h : [x : int?] -> [y : (>/c x)] -> (>/c y))
(define (h x) ...) ; unknown definition

(main : int? -> (>/c 0))
(define (main n) (if (> n 0) (f n (h n)) 1))

In this program, we define a contract combinator (>/c) that 
creates a check for an integer from a lower bound; a helper function 
f , which comes without a contract; and an unknown function h that 
given an integer x, returns a function mapping some number y that is 
greater than x to an answer greater than y — here h's specification is 
given, but not its implementation. (Note h's contract is dependent.) 
We verify main's correctness, which means it definitely returns a 
positive integer and does not violate h's contract. 

According to its contract, main is passed an integer n. If n is not
positive, main returns 1, satisfying the contract. Otherwise the function
applies f to n and (h n). Function h, by its contract, returns another
function that requires a number greater than n. Examining f's definition,
we see that (h n) (now bound to g) is eventually applied to (+ n 1).
Let n1 be the result of (+ n 1). By h's contract, we know the
answer is another integer greater than (+ n 1). Let us name this
answer n2. In order to verify that main satisfies contract (>/c 0),
we need to verify that n2 is a positive integer.

Once f returns, the heap contains several addresses with con- 
tracts: 

n  ↦ {int?, (>/c 0)}
n1 ↦ {int?, (=/c (+ n 1))}
n2 ↦ {int?, (>/c n1)}
We then translate this information to a query for an external solver: 

n, n1, n2 : INT;
ASSERT n > 0;
ASSERT n1 = n + 1;
ASSERT n2 > n1;
QUERY n2 > 0;

Solvers such as CVC4 (Barrett et al. 2011) and Z3 (De Moura 
and Bjorner 2008) easily verify this implication, proving main's 
correctness. 
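
As an illustration of the translation step, the sketch below emits an equivalent query in SMT-LIB 2 syntax, which both Z3 and CVC4 accept. It is a hedged example rather than the translation used by SCV: the function `heap_to_smtlib` and the representation of heap facts as ready-made s-expression strings are assumptions made for brevity.

```python
def heap_to_smtlib(heap, goal):
    """Translate integer refinements on the heap into an SMT-LIB 2 script.

    `heap` maps a variable name to a list of s-expression facts about it.
    The goal is asserted negated, so an `unsat` answer from the solver
    means the goal holds in every model of the heap.
    """
    lines = [f"(declare-const {name} Int)" for name in heap]
    for facts in heap.values():
        for fact in facts:
            lines.append(f"(assert {fact})")
    lines.append(f"(assert (not {goal}))")
    lines.append("(check-sat)")
    return "\n".join(lines)


# The three heap entries from the example, written as solver facts.
heap = {
    "n": ["(> n 0)"],
    "n1": ["(= n1 (+ n 1))"],
    "n2": ["(> n2 n1)"],
}
script = heap_to_smtlib(heap, "(> n2 0)")
```

Only the heap is translated, so the script contains nothing but integer constants and linear constraints, regardless of how many higher-order functions the values passed through.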

Refinements such as (>/c 0) are generated by primitive ap- 
plications (> x 0), and queries are generated from translation of 
the heap, not arbitrary expressions. This has a few consequences. 
First, by the time we have value v satisfying predicate p on the heap, 
we know that p terminates successfully on v. Issues such as errors 
(from p itself) or divergence are handled elsewhere in other eval- 
uation branches. Second, we only need to translate a small set of 
simple, well understood contracts — not arbitrary expressions. Eval- 
uation naturally breaks down complex expressions, and properties 
are discovered even when they are buried in complex, higher-order 
functions. Given a translation for (>/c 0), the analysis automati- 
cally takes advantage of the solver even when the predicate contains 
> in a complex way, such as (λ (x) (or (> x 0) e)) where e is an
arbitrary expression. Predicates that lack translations to SMT only 
reduce precision, never soundness. 

2.4 Converging for non-tail recursion 

The techniques sketched above provide high precision in the ex- 
amples considered, but simply executing programs on abstract val- 
ues is unlikely to terminate in the presence of recursion. When an 
abstract value stands for an infinite set of concrete values, execution
may proceed infinitely, building up an ever-growing evaluation
context. To tackle this problem, we summarize this context to coa- 
lesce repeated structures and enable termination on many recursive 
programs. Although guaranteed termination is not our goal, the em- 
pirical results (§4) demonstrate that the method is effective in prac- 
tice. 

The following example program is taken from work on model 
checking of higher-order functional programs (Kobayashi et al. 
2010), and demonstrates checking non-trivial safety properties on 
recursive functions. Note that no loop invariants need be provided 
by the user. 

(main : (and/c int? ≥0?) -> (and/c int? ≥0?))
(define (main n)
  (let ([l (make-list n)])
    (if (> n 0) (car (reverse l empty)) 0)))

(define (reverse l ac)
  (if (empty? l) ac
      (reverse (cdr l) (cons (car l) ac))))

(define (make-list n)
  (if (= n 0) empty
      (cons n (make-list (- n 1)))))

Again, we aim to verify both the specified contract for main as 
well as the preconditions for primitive operations such as car. Most
significantly, we need to verify that (reverse l empty) produces
a non-empty list (so that car succeeds) and that its first element is 
a positive integer. The local functions reverse and make-list do 
not come with a contract. 

This problem is more challenging than the original OCaml ver- 
sion of the same program, due to the lack of types. This program 
represents a common idiom in dynamic languages: not all values are 
contracted, and there is no type system on which to piggy-back ver- 
ification. In addition, programmers often rely on inter-procedural 
reasoning to justify their code's correctness, as here with reverse. 

We verify main by applying it to an abstract (unknown) value 
n1. The contract ensures that within the body, n1 is a non-negative
integer. 

The integer n1 is first passed to make-list. The comparison
(= n1 0) non-deterministically returns true and false, updating
the information known about n1 to be either 0 or (>/c 0) in each
corresponding case. In the first case, make-list returns empty. In
the second case, make-list proceeds to the recursive application
(make-list n2), where n2 is the abstract non-negative integer
obtained from evaluating (- n1 1). However, (make-list n2)
is identical to the original call (make-list n1) up to renaming,
since both n1 and n2 are non-negative. Therefore, we pause here
and use a summary of make-list's result instead of continuing in
an infinite loop.

Since we already know that empty is one possible result of
(make-list n1), we use it as the result of (make-list n2). The
application (make-list n1) therefore produces the pair (n1, empty),
which is another answer for the original application. We could
continue this process and plug this new result into the pending
application (make-list n2). But by observing that the application
produces a list of one positive integer when the recursive call produces
empty, we approximate the new result (n1, empty) to a non-empty
list of positive integers, and then use this approximate answer as
the result of the pending application (make-list n2). This then
induces another result for (make-list n1), a list of two or more
positive integers, but this is subsumed by the previous answer of
non-empty integer list. We have now discovered all possible return
values of make-list when applied to a non-negative integer:
it maps 0 to empty, and positive integers to a non-empty list of
positive integers.
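
The summarization just described behaves like a small fixpoint computation. The sketch below is a deliberately simplified stand-in, not the context-summarization mechanism of this paper: it hand-codes a two-element abstract domain for lists, and the names `cons_abstract` and `summarize_make_list` are hypothetical.

```python
def cons_abstract(head, tail):
    """Abstract cons: a positive head on a list of positives is widened
    to the single summary value 'nonempty-pos-list'."""
    assert head == "pos" and tail in ("empty", "nonempty-pos-list")
    return "nonempty-pos-list"


def summarize_make_list():
    """Collect (argument, result) pairs for make-list on a non-negative
    integer, reusing known results at the recursive call."""
    summary = set()
    while True:
        # Branch (= n 0) is true: n is 0 and the result is empty.
        found = {("zero", "empty")}
        # Branch is false: n is positive and (- n 1) is again non-negative,
        # so the recursive call is answered from the current summary.
        for _arg, tail in summary:
            found.add(("pos", cons_abstract("pos", tail)))
        if found <= summary:
            return summary  # no new answers: the summary is complete
        summary |= found
```

The loop stops after three rounds because consing onto a non-empty list of positives yields an answer that is already in the summary.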

Although our explanation made use of the order, the soundness 
of analyzing make-list does not depend on the order of explor- 
ing non-deterministic branches. Each recursive application with re- 
peated arguments generates a waiting context, and each function 
return generates a new case to resume. There is an implicit work- 
list algorithm in the modified semantics (§3.8). 

When make-list returns to main, we have two separate cases:
either n1 is 0 and l is empty, or n1 is positive and l is non-empty.
In the first case, (> n1 0) is false and main returns 0, satisfying
the contract. Otherwise, main proceeds to reversing the list before 
taking its first element. 

Using the same mechanism as with make-list, the analysis infers
that reverse returns a non-empty list when either of its arguments
(l or ac) is non-empty. In addition, reverse only receives
arguments of proper lists, so all partial operations on l such as car
and cdr are safe when l is not empty, without needing an explicit
check. The function eventually returns a non-empty list of integers
to main, justifying main's call to the partial function car, producing
a positive integer. Thus, main never has a run-time error in any
context.

While this analysis makes use of the implementation of make- 
list and reverse, that does not imply that it is whole-program. 
Instead, it is modular in its use of unknown values abstracting ar- 
bitrary behavior. For example, make-list could instead be an ab- 
stract value represented by a contract that always produces lists of 
integers. The analysis would still succeed in proving all contracts 
safe except the use of car in main; this shows the flexibility
available in choosing between precision and modularity. In addition, the
analysis does not have to be perfectly precise to be useful. If it suc- 
cessfully verifies most contracts in a module, that already greatly 
improves confidence about the module's correctness and justifies 
the elimination of numerous expensive dynamic checks. 

2.5 Putting it all together 

The following example illustrates all aspects of our system. For this, 
we choose a simple encoding of classes as functions that produce 
objects, where objects are again functions that respond to messages 
named by symbols. We then verify the correctness of a mixin: a 
function from classes to classes. The vec/c contract enforces the 
interface of a 2D-vector class whose objects accept messages 'x,
'y, and 'add for extracting components and vector addition.

(define vec/c
  ([msg : (one-of/c 'x 'y 'add)]
   -> (match msg
        [(or 'x 'y) real?]
        ['add (vec/c -> vec/c)])))

This definition demonstrates several powerful contract system fea- 
tures which we are able to handle: 

• contracts are first-class values, as in the definition of vec/c, 

• contracts may include arbitrary predicates, such as real?, 

• contracts may be recursive, as in the contract for 'add,

• function contracts may express dependent relationships be- 
tween the domain and range — the contract of the result of 
method selection for vec/c depends on the method chosen. 

Suppose we want to define a mixin that takes any class that 
satisfies the vec/c interface and produces another class with added 
vector operations such as 'len for computing the vector's length.
The extend function defines this mixin, and ext-vec/c specifies
the new interface. We verify that extend violates no contracts and 
returns a class that respects specifications from ext-vec/c. 






(extend : (real? real? -> vec/c)
       -> (real? real? -> ext-vec/c))
(define (extend mk-vec)
  (λ (x y)
    (let ([vec (mk-vec x y)])
      (λ (m)
        (match m
          ['len
           (let ([x (vec 'x)] [y (vec 'y)])
             (sqrt (+ (* x x) (* y y))))]
          [_ (vec m)])))))

(define ext-vec/c
  ([msg : (one-of/c 'x 'y 'add 'len)]
   -> (match msg
        [(or 'x 'y) real?]
        ['add (vec/c -> vec/c)]
        ['len (and/c real? (≥/c 0))])))

To verify extend, we provide an arbitrary value, which is guaranteed
by its contract to be a class matching vec/c. The mixin returns
a new class whose objects understand messages 'x, 'y, 'add,
and 'len. This new class defines method 'len and relies on the
underlying class to respond to 'x, 'y, and 'add. Because the old class
is constrained by contract vec/c, the new class will not violate its 
contract when responding to messages 'x, 'y, and 'add. 

For the 'len message, the object in the new vector class extracts
its components as abstract numbers x and y, according to interface 
vec/c. It then computes their squares and leaves the following 
information on the heap: 

x2 ↦ {real?, (=/c (* x x))}
y2 ↦ {real?, (=/c (* y y))}
s  ↦ {real?, (=/c (+ x2 y2))}

Solvers such as Z3 (De Moura and Bjorner 2008) can handle simple 
non-linear arithmetic and verify that the sum s is non-negative, 
thus the sqrt operation is safe. Execution proceeds to take the 
square root (now called l) and refines the heap with the following
mapping:

l ↦ {real?, (=/c (sqrt s))}

When the method returns, its result is checked by contract ext- 
vec/c to be a non-negative number. We again rely on the solver to 
prove that this is the case. 

Therefore, extend is guaranteed to produce a new class that is 
correct with respect to interface ext-vec/c, justifying the elimination
of expensive run-time checks. In a Racket program computing
the length of 100000 random vectors, eliminating these contracts
results in a 100-fold speed-up. While such dramatic results are
unlikely in full programs, measurements of existing Racket programs
suggest that 50% speedups are possible (Strickland et al. 2012).
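
For readers unfamiliar with the encoding, the following Python sketch mirrors the classes-as-functions idiom used by extend. It runs on concrete numbers and carries no contracts; `make_vec` is a hypothetical base class written for this example, since the text leaves the underlying class unknown.

```python
import math


def make_vec(x, y):
    """A class is a function producing objects; an object is a function
    from message names to values or methods."""
    def obj(msg):
        if msg == "x":
            return x
        if msg == "y":
            return y
        if msg == "add":
            return lambda other: make_vec(x + other("x"), y + other("y"))
        raise ValueError(f"unknown message: {msg}")
    return obj


def extend(mk_vec):
    """Mixin: wrap a vector class with an extra 'len' message."""
    def mk_ext(x, y):
        vec = mk_vec(x, y)

        def obj(m):
            if m == "len":
                vx, vy = vec("x"), vec("y")
                return math.sqrt(vx * vx + vy * vy)
            return vec(m)  # delegate every other message
        return obj
    return mk_ext


v = extend(make_vec)(3.0, 4.0)
```

Here `v("len")` computes the length from the delegated components, and messages other than "len" fall through to the wrapped object unchanged.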

3. A Symbolic Language with Contracts 

In this section, we give a reduction system describing the core of 
our approach. Symbolic $\lambda_C$ is a model of a language with first-class
contracts and symbolic values. We first present the semantics, in- 
cluding handling of primitives and unknown functions. We then 
describe how the handling of primitive values integrates with exter- 
nal solvers. Finally, we show an abstraction of our symbolic system 
to accelerate convergence. For each abstraction, we relate concrete 
and symbolic programs and prove a soundness theorem. 

At a high level, the key idea of our semantics is that abstract val- 
ues behave non-deterministically in all possible ways that concrete 
values might behave. Furthermore, abstract values can be bounded by
specifications in the form of contracts that limit these behaviors. As
a result, an operational semantics for abstract values explores all the
ways that the concrete program under consideration might be used. 

Given this operational semantics, we can then examine the results
of evaluation to see if any results are errors blaming the components
we wish to verify. If none are, then our soundness theorem
implies that there are no ways for the component to be blamed,
regardless of what other parts of the program do. Thus, we have 
verified the component against its contract in all contexts. We make 
this notion precise in section 3.6. 

3.1 Syntax of Symbolic $\lambda_C$

Our initial language models the functional core of many modern 
dynamic languages, extended with behavioral, first-class contracts, 
as well as symbolic values. The abstract syntax is shown in figure 1.
Syntax needed only for symbolic execution is highlighted in gray; 
we discuss it after the syntax of concrete programs. 

A program p is a sequence of module definitions followed by a 
top-level expression which may reference the modules. Each module
m has a name $\ell$ and exports a single value u with behavior
enforced by contract $u_c$. (Generalizing to multiple-export modules
is straightforward.) 

Expressions include standard forms such as values, variable and 
module references, applications, and conditionals, as well as those 
for constructing and monitoring contracts. Contracts are first-class 
values and can be produced by arbitrary expressions. For clarity, 
when an expression plays the role of a contract, we use the metavariables
c and d, rather than e. A dependent function contract $(c \to \lambda x.d)$
monitors a function's argument with c and its result with the contract
produced by applying $\lambda x.d$ to the argument.

A contract violation at run-time causes blame, an error with
information about who violated the contract. We write
$\mathsf{blame}^{\ell}_{\ell''}$ to mean module $\ell$ is blamed for
violating the contract from $\ell''$. The form
$\mathsf{mon}^{\ell,\ell'}_{\ell''}(c, e)$ monitors expression e with
contract c, with $\ell$ being the positive party, $\ell'$ the negative
party, and $\ell''$ the source of the contract. The system blames the
positive party if e produces a value violating c, and the negative one
if e is misused by the context of the contract check. To make context
information available at run-time, we annotate references and
applications with labels indicating the module they appear in, or
$\dagger$ for the top-level expression. For example, $x^{\dagger}$
denotes a reference to the name x from the top level, and
$(\mathsf{add1}\ x)^{\ell}$ denotes an addition inside module $\ell$.
When a module $\ell$ causes a primitive error, such as applying 5, we
also write $\mathsf{blame}^{\ell}_{\Lambda}$, indicating that it
violates a contract with the language. We omit labels when they are
irrelevant or can be inferred.
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Pre-values u — extended to values below — include abstractions, 
base values, pairs of values, and dependent contracts with domain 
components evaluated. Base values include numbers, booleans, and 
the empty list. Primitive operations overvalues are standard, includ- 
ing predicates o? for dynamic testing of data types. 

To reason about absent components, we equip $\lambda_C$ with unknown,
or symbolic, values, which abstract over multiple concrete values
exhibiting a range of behavior. An unknown value • stands for
any value in the language. For soundness, execution must account
for all possible concretizations of abstract values, and reduction
becomes non-deterministic. As the program proceeds through tests
and contract checks, more assumptions can be made about abstract
values. To remember these assumptions, we take the pre-values and
refine each with a set of contracts it is known to satisfy, written
$u/\vec{v}$.

Finally, to track refinements of unknown values, we use heap
addresses $a$ as symbolic values and track them in a heap, which is
a finite map from addresses to refined pre-values:

Heaps $\sigma ::= \{\overrightarrow{a \mapsto u/\vec{v}}\}$

The heap $\sigma$ maps addresses allocated for unknown values to
refinements expressed as contracts; these refinements are updated
during reduction and represent upper bounds on what they might be at
run-time. Intuitively, any possible concrete execution can be obtained
by substituting addresses with concrete values within bounds
specified by the heap. We omit refinements when they are empty or
irrelevant.

3.2 Semantics of Symbolic $\lambda_C$

We now turn to the reduction semantics for Symbolic $\lambda_C$, which
combines standard rules for untyped languages with behavior for
unknown values. Reduction is defined as a relation on states,
parameterized by a module context:

$$\vec{m} \vdash \varsigma \longmapsto \varsigma'$$

States consist either of an expression paired with a heap, or blame:

States $\varsigma ::= (e, \sigma) \mid \mathsf{blame}^{\ell}_{\ell''}$

We present the rules inline; a full version of all rules is given
in the appendix of the accompanying technical report (Nguyen
et al. 2014). In the inline presentation of rules, we systematically
omit labels in contracts; these are presented in the full rules. We
omit the module context whenever it is irrelevant.

3.2.1 Basic rules 

Applications of primitives are interpreted by a $\delta$ relation,
which maps operations, arguments and heaps to results and new heaps.

Apply-Primitive:
$$\frac{\delta(\sigma, o, \vec{v}) \ni \varsigma}{(o\ \vec{v}), \sigma \longmapsto \varsigma}$$

The use of a $\delta$ relation in reduction semantics is standard, but
typically it is a function and is independent of the heap. We make
$\delta$ dependent on the heap in order to use and update the current
set of invariants; we make it a relation, since it may behave
non-deterministically on unknown values. For example, in interpreting
(> • 5), the $\delta$ relation will produce two results: one true, with
an updated heap to reflect that the unknown value is (>/c 5); the other
false, with a heap reflecting the opposite. The $\delta$ relation is
thus the hub of the verification system and a point of interaction with
the SMT solver. It is described in more detail in section 3.3.
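To make the non-determinism concrete, the following is a minimal Python sketch of how a heap-dependent $\delta$ might treat `>` on an unknown value. The heap encoding (a dict from address names to frozensets of refinement tags) and the names `delta_gt`, `">"` and `"not>"` are assumptions made for illustration; they are not taken from the paper or from SCV.

```python
def delta_gt(heap, addr, n):
    """Interpret (> addr n) where addr names an unknown value in the heap.

    Returns a list of (result, new_heap) pairs: one pair when the heap
    already decides the comparison, two pairs otherwise.
    """
    known = heap[addr]
    if (">", n) in known:        # already assumed greater than n
        return [(True, heap)]
    if ("not>", n) in known:     # already assumed not greater than n
        return [(False, heap)]
    # Undecided: branch, recording the assumption made on each branch.
    yes = {**heap, addr: known | {(">", n)}}
    no = {**heap, addr: known | {("not>", n)}}
    return [(True, yes), (False, no)]


heap0 = {"a": frozenset()}
branches = delta_gt(heap0, "a", 5)
```

Here `branches` holds a true branch and a false branch, each with its own refined heap; asking the same question again under either refined heap yields a single answer.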



Applications of $\lambda$-abstractions follow standard
$\beta$-reduction; applications of non-functions result in blame.

Apply-Function:
$$((\lambda x.e)\ v), \sigma \longmapsto [v/x]e, \sigma$$

Apply-Non-Function:
$$\frac{\delta(\sigma, \mathsf{proc?}, v) \ni (\mathsf{false}, \sigma')}{(v\ v'), \sigma \longmapsto \mathsf{blame}, \sigma'}$$

Notice that the $\delta$ relation is employed to determine whether the
value in operator position is a function using the proc? primitive.
(Non-functions include concrete numbers and booleans as well as
abstract values known to exclude functions; application of abstract
values that may be functions is described in section 3.2.3.)

Conditionals follow a common treatment in untyped languages
in which values other than false are considered true.

If-True:
$$\frac{\delta(\sigma, \mathsf{false?}, v) \ni (\mathsf{false}, \sigma')}{(\mathsf{if}\ v\ e_1\ e_2), \sigma \longmapsto e_1, \sigma'}$$

If-False:
$$\frac{\delta(\sigma, \mathsf{false?}, v) \ni (\mathsf{true}, \sigma')}{(\mathsf{if}\ v\ e_1\ e_2), \sigma \longmapsto e_2, \sigma'}$$

Just as in the case of Apply-Non-Function, the interpretation of
conditionals uses the $\delta$ relation to determine whether false?
holds. The relation takes into account all of the knowledge
accumulated in $\sigma$ and, in whichever branch is taken, updates the
current knowledge to reflect whether false? of $v$ holds. This is the
mechanism by which control-flow based refinements are enabled.

The two rules for module references reflect the approach in
which contracts are treated as boundaries between components
(Dimoulas et al. 2011): a module self-reference incurs no contract
check, while cross-module references are protected by the specified
contract.

Module-Self-Reference:
$$\frac{(\mathsf{module}\ f\ u_c\ u) \in \vec{m}}{\vec{m} \vdash f^{f}, \sigma \longmapsto u, \sigma}$$

Module-External-Reference:
$$\frac{(\mathsf{module}\ f\ u_c\ u) \in \vec{m} \qquad f \neq \ell}{\vec{m} \vdash f^{\ell}, \sigma \longmapsto \mathsf{mon}(u_c, u), \sigma}$$

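Finally, any state that is stuck with blame inside an evaluation
context transitions to a final blame state that discards the
surrounding context and heap.

Halt-Blame:
$$\mathcal{E}[\mathsf{blame}], \sigma \longmapsto \mathsf{blame}$$

Evaluation contexts are defined as follows:

$$\mathcal{E} ::= [\,] \mid \mathcal{E}\ e \mid v\ \mathcal{E} \mid o\ \vec{v}\ \mathcal{E}\ \vec{e} \mid \mathsf{if}\ \mathcal{E}\ e\ e \mid \mathsf{mon}(\mathcal{E}, e) \mid \mathsf{mon}(v, \mathcal{E}) \mid \mathcal{E} \to \lambda x.e$$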

3.2.2 Contract monitoring 

Contract monitoring follows existing operational semantics for con- 
tracts (Findler and Felleisen 2002), with extensions to handle and 
refine symbolic values. 

There are several cases for checking a value against a contract. 
If the contract is not a function contract, we say it is flat, denoting 
a first-order property to be checked immediately. We thus expand 
the checking expression to a conditional. 

Monitor-Flat-Contract:
$$\frac{\delta(\sigma, \mathsf{dep?}, v_c) \ni (\mathsf{false}, \sigma') \qquad \sigma' \vdash v : v_c\ ?}{\mathsf{mon}(v_c, v), \sigma \longmapsto (\mathsf{if}\ (v_c\ v)\ \mathsf{assume}(v, v_c)\ \mathsf{blame}), \sigma'}$$

Since contracts are first-class, they can also be abstract values; we
rely on $\delta$ to determine whether a value is a flat contract by
using (the negation of) the predicate for dependent contracts, dep?,
instead of examining the syntax. This rule is standard except for the
use of $\mathsf{assume}(v, v_c)$ and the $(\cdot \vdash \cdot : \cdot\ ?)$
judgment. The $\mathsf{assume}(v, v_c)$ form, which would normally
just be $v$, dynamically refines value $v$ and the heap to indicate
that $v$ satisfies $v_c$; assume is discussed further in section
3.2.3. The judgment $\sigma' \vdash v : v_c\ ?$, which would normally
just be omitted, indicates that the contract $v_c$ cannot be
statically judged to either pass or fail for $v$, which is why the
predicate must be applied. This judgment and its closely related
counterparts $(\cdot \vdash \cdot : \cdot\ ✓)$ and
$(\cdot \vdash \cdot : \cdot\ ✗)$, which statically prove
a value must or must not satisfy a given contract respectively, are
discussed in section 3.4. If a flat contract can be statically proved
or refuted, monitoring can be short-circuited.



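Monitor-Proved:
$$\frac{\delta(\sigma, \mathsf{dep?}, v_c) \ni (\mathsf{false}, \sigma') \qquad \sigma' \vdash v : v_c\ ✓}{\mathsf{mon}(v_c, v), \sigma \longmapsto v, \sigma'}$$

Monitor-Refuted:
$$\frac{\delta(\sigma, \mathsf{dep?}, v_c) \ni (\mathsf{false}, \sigma') \qquad \sigma' \vdash v : v_c\ ✗}{\mathsf{mon}(v_c, v), \sigma \longmapsto \mathsf{blame}, \sigma'}$$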



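Monitoring a function contract against a function is interpreted
by the standard $\eta$-expansion of contracts.

Monitor-Function-Contract:
$$\frac{\delta(\sigma, \mathsf{proc?}, v) \ni (\mathsf{true}, \sigma')}{\mathsf{mon}(v_c \to \lambda x.d, v), \sigma \longmapsto \lambda x.\mathsf{mon}(d, (v\ \mathsf{mon}(v_c, x))), \sigma'}$$

Monitoring a function contract against a non-function results in
an error.

Monitor-Non-Function:
$$\frac{\delta(\sigma, \mathsf{dep?}, v_c) \ni (\mathsf{true}, \sigma_1) \qquad \delta(\sigma_1, \mathsf{proc?}, v) \ni (\mathsf{false}, \sigma_2)}{\mathsf{mon}(v_c, v), \sigma \longmapsto \mathsf{blame}, \sigma_2}$$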
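The concrete part of these monitoring rules (flat checks, wrapping of functions, and blame assignment) can be illustrated with a small Python sketch. It covers only fully known values, with no heap and no symbolic refinement, and the names `Blame`, `flat`, `dep` and `mon` are chosen here for illustration rather than taken from the paper's implementation.

```python
class Blame(Exception):
    """Contract violation; `party` records who is blamed."""
    def __init__(self, party):
        super().__init__(f"blame {party}")
        self.party = party


def flat(pred):
    """A flat (first-order) contract, checked immediately."""
    return ("flat", pred)


def dep(dom, make_rng):
    """A dependent function contract: domain and a range-contract maker."""
    return ("dep", dom, make_rng)


def mon(contract, value, pos, neg):
    """Monitor `value` against `contract`; `pos` supplied it, `neg` uses it."""
    if contract[0] == "flat":
        if contract[1](value):
            return value
        raise Blame(pos)
    _, dom, make_rng = contract
    if not callable(value):           # function contract on a non-function
        raise Blame(pos)

    def wrapped(x):
        # The argument comes from the context, so the parties swap.
        arg = mon(dom, x, neg, pos)
        return mon(make_rng(x), value(arg), pos, neg)
    return wrapped


is_int = flat(lambda v: isinstance(v, int))
grows = dep(is_int, lambda x: flat(lambda r: r > x))
inc = mon(grows, lambda n: n + 1, "f", "top")
dec = mon(grows, lambda n: n - 1, "f", "top")
```

Calling `inc(1)` succeeds, `inc("a")` blames the caller `"top"` for a bad argument, and `dec(1)` blames module `"f"` for a result that violates the dependent range.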

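When a dependent contract is represented by an address in the
heap, we look up the address and use the result.

Monitor-Unknown-Function-Contract:
$$\frac{\delta(\sigma, \mathsf{dep?}, a) \ni (\mathsf{true}, \sigma_1) \qquad \delta(\sigma_1, \mathsf{proc?}, v) \ni (\mathsf{true}, \sigma_2) \qquad \sigma_2(a) = v_c \to \lambda x.d}{\mathsf{mon}(a, v), \sigma \longmapsto \lambda x.\mathsf{mon}(d, (v\ \mathsf{mon}(v_c, x))), \sigma_2}$$

3.2.3 Handling unknown values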

The final set of reduction rules concerns unknown values and
refinements.

Refine-Concrete:
$$\frac{u \neq \bullet}{u, \sigma \longmapsto u/\emptyset, \sigma}$$

Refine-Unknown:
$$\frac{a \notin \mathrm{dom}(\sigma)}{\bullet, \sigma \longmapsto a, \sigma[a \mapsto \bullet/\emptyset]}$$

These two rules show reduction of pre-values, which initially have
no refinement. If the pre-value is unknown, we additionally create
a fresh address and add it to the heap.

The assume form uses the refine metafunction to update the
heap of refinements to take into account the new information; see
figure 2 for the definition of refine.

Assume:
$$\frac{(\sigma', v') = \mathsf{refine}(\sigma, v, v_c)}{\mathsf{assume}(v, v_c), \sigma \longmapsto v', \sigma'}$$

Refinement is straightforward propagation of known contracts,
including expanding values known to be pairs via cons? into pair
values, and values known to be function contracts (via dep?) into
function contract values.
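The shape of the refine metafunction can be sketched in Python as follows. This is an illustrative encoding only: heaps are dicts from address names to `(pre_value, contracts)` pairs, contracts are plain strings, and `fresh` is a hypothetical helper for allocating unused addresses.

```python
UNKNOWN = "•"


def fresh(heap):
    """Return an address name not yet present in the heap."""
    i = len(heap)
    while f"a{i}" in heap:
        i += 1
    return f"a{i}"


def refine(heap, v, contract):
    """Return (new_heap, new_value) after learning that v satisfies contract."""
    if isinstance(v, str) and v in heap:
        # Address: refine what it points to, then update the heap entry.
        heap2, refined = refine(heap, heap[v], contract)
        return {**heap2, v: refined}, v
    pre, known = v
    if pre == UNKNOWN and contract == "cons?":
        # An unknown known to be a pair becomes a pair of fresh unknowns.
        a1 = fresh(heap)
        heap = {**heap, a1: (UNKNOWN, frozenset())}
        a2 = fresh(heap)
        heap = {**heap, a2: (UNKNOWN, frozenset())}
        return heap, (("pair", a1, a2), known)
    if pre == UNKNOWN and contract == "dep?":
        # An unknown function contract gets a fresh unknown domain.
        a = fresh(heap)
        return {**heap, a: (UNKNOWN, frozenset())}, (("dep", a, UNKNOWN), known)
    # Default: remember the contract in the refinement set.
    return heap, (pre, known | {contract})


heap0 = {"x": (UNKNOWN, frozenset())}
heap_num, _ = refine(heap0, "x", "num?")
heap_pair, _ = refine(heap0, "x", "cons?")
```

Heaps are treated as immutable here, so each branch of a non-deterministic step can keep its own refined copy.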

Finally, we must handle application of unknown values. The first
rule simply produces a new unknown value and heap address for the
result of a call. If the unknown function came with a contract, this
new unknown value will be refined by the contract via reduction.

Apply-Unknown:
$$\frac{\delta(\sigma, \mathsf{proc?}, a) \ni (\mathsf{true}, \sigma')}{(a\ v), \sigma \longmapsto a_a, \sigma'[a_a \mapsto \bullet]}$$

Havoc:
$$\frac{\delta(\sigma, \mathsf{proc?}, a) \ni (\mathsf{true}, \sigma')}{(a\ v), \sigma \longmapsto (\mathsf{havoc}\ v), \sigma'}$$

$$\begin{aligned}
\mathsf{refine}(\sigma, a, v) &= (\sigma'[a \mapsto v'], a) && \text{where } (\sigma', v') = \mathsf{refine}(\sigma, \sigma(a), v) \\
\mathsf{refine}(\sigma, \bullet/\vec{v}, \mathsf{cons?}) &= (\sigma[a_1 \mapsto \bullet][a_2 \mapsto \bullet], (a_1, a_2)/\vec{v}) && \text{where } a_1, a_2 \notin \mathrm{dom}(\sigma) \\
\mathsf{refine}(\sigma, \bullet/\vec{v}, \mathsf{dep?}) &= (\sigma[a \mapsto \bullet], a \to \lambda x.\bullet/\vec{v}) && \text{where } a \notin \mathrm{dom}(\sigma) \\
\mathsf{refine}(\sigma, u/\vec{v}, v_i) &= (\sigma, u/\vec{v} \cup \{v_i\})
\end{aligned}$$

Figure 2. Refinement for Symbolic $\lambda_C$

The second reduction rule for applying an unknown function,
labeled Havoc, handles the possible dynamic behavior of the
unknown function. A value passed to the unknown function may
itself be a function with behavior, whose implementation we hope
to verify. This function may further be invoked by the unknown
function on unknown arguments. To simulate this, we assume
arbitrary behavior from this unknown function and put the argument in
a so-called demonic context implemented by the havoc operation,
defined in a module added to every program; the definition is given
below.

(module havoc (any → λ_.false)
  (λx.amb({(havoc (x •)), (havoc (car x)), (havoc (cdr x))})))

amb({e}) = e
amb({e, e1, ...}) = if • e amb({e1, ...})

The havoc function never produces useful results; its only purpose
is to probe for all potential errors in the value provided. This
context, and thus the havoc module, may be blamed for misuse of
accessors and applications; we ignore these, as they represent
potential failures in omitted portions of the program. Using havoc is
key to soundness in modular higher-order static checking (Fähndrich
and Logozzo 2011; Tobin-Hochstadt and Van Horn 2012); we discuss
its role further in section 3.6. Intuitively, precise execution of
properly contracted functions prevents havoc from destroying every
analysis.
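The amb form is only a shorthand for nested conditionals on an unknown test. As a small illustration, the Python sketch below expands it over expressions encoded as tuples; the tuple encoding and the helper name `havoc_body` are assumptions for this example.

```python
UNKNOWN = "•"


def amb(exprs):
    """Expand amb over a non-empty list into nested ifs on an unknown test."""
    first, *rest = exprs
    if not rest:
        return first
    return ("if", UNKNOWN, first, amb(rest))


def havoc_body(x):
    """The body of havoc: apply x, or project it as a pair, then recur."""
    return amb([
        ("havoc", (x, UNKNOWN)),      # (havoc (x •))
        ("havoc", ("car", x)),        # (havoc (car x))
        ("havoc", ("cdr", x)),        # (havoc (cdr x))
    ])


expanded = amb(["e1", "e2", "e3"])
```

Because each test is the unknown value, symbolic execution explores every alternative, which is what lets havoc probe all uses of its argument.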

3.3 Primitive operations 

Primitive operations are the primary place where unknown values
in the heap are refined, in concert with successful contract checks.
Figure 3 shows a representative excerpt of $\delta$'s definition; the
full definition is given in the accompanying technical report.

The first three rules cover primitive predicate checks. Ambiguity
never occurs for concrete values, and an abstract value may
definitely prove or refute the predicate if the available information
is enough for the conclusion. If the proof system cannot decide a
definite result for the predicate check, $\delta$ conservatively
includes both answers in the possible results and records assumptions
chosen for each non-deterministic branch in the appropriate heap. The
last three rules reveal possible refinements when applying partial
functions such as add1, which fails when given non-numeric inputs.
This mechanism, when combined with the SMT-aided proof
system given below, is sufficient to provide the precision necessary
to prove the absence of contract errors.
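As a rough illustration of the add1 cases, the Python sketch below returns every possible outcome together with the heap that records the assumption behind it. The encoding (refinement sets as frozensets of tags, `"blame"` as a marker result, `fresh` as an address allocator) is hypothetical and much simpler than the actual definition of $\delta$.

```python
def fresh(heap):
    """Return an address name not yet present in the heap."""
    i = len(heap)
    while f"a{i}" in heap:
        i += 1
    return f"a{i}"


def delta_add1(heap, v):
    """Return all (result, heap) outcomes of applying add1 to v."""
    if isinstance(v, int) and not isinstance(v, bool):
        return [(v + 1, heap)]                    # concrete number
    if isinstance(v, str) and v in heap:          # address of an unknown
        known = heap[v]
        outcomes = []
        if "not-num?" not in known:
            # v may be a number: assume so, result is a fresh unknown number.
            assumed = {**heap, v: known | {"num?"}}
            a = fresh(assumed)
            outcomes.append((a, {**assumed, a: frozenset({"num?"})}))
        if "num?" not in known:
            # v may be a non-number: add1 blames its caller.
            outcomes.append(("blame", {**heap, v: known | {"not-num?"}}))
        return outcomes
    return [("blame", heap)]                      # concrete non-number


unknown_heap = {"x": frozenset()}
outcomes = delta_add1(unknown_heap, "x")
```

For an unrefined unknown, both a numeric result and blame are reported; once `"num?"` is recorded for `x`, only the numeric outcome remains.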

3.4 SMT-aided proof system 

Contract checking and primitive operations rely on a proof system
to statically relate values and contracts. We write
$\sigma \vdash v : v_c\ ✓$ to mean value $v$ satisfies contract $v_c$,
where all addresses in $v$ are defined in $\sigma$. In other words,
under any possible instantiation of the unknown values in $v$, it
would satisfy $v_c$ when checked according to the semantics. On the
other hand, $\sigma \vdash v : v_c\ ✗$ indicates that $v$ definitely
fails $v_c$. Finally, $\sigma \vdash v : v_c\ ?$ is a conservative
answer when information from the heap and refinement set is
insufficient to draw a definite conclusion. The effectiveness of our
analysis depends on the precision of this provability relation:
increasing the number of contracts that can be related statically to
values prunes spurious paths and eliminates impossible error cases.

$$\begin{aligned}
\delta(\sigma, o?, v) &\ni (\mathsf{true}, \sigma) && \text{if } \sigma \vdash v : o?\ ✓ \\
\delta(\sigma, o?, v) &\ni (\mathsf{false}, \sigma) && \text{if } \sigma \vdash v : o?\ ✗ \\
\delta(\sigma, o?, a) &\supseteq \{(\mathsf{true}, \sigma_t), (\mathsf{false}, \sigma_f)\} && \text{if } \sigma \vdash a : o?\ ? \text{ and } (\sigma_t, \_) = \mathsf{refine}(\sigma, a, o?) \\
& && \text{and } (\sigma_f, \_) = \mathsf{refine}(\sigma, a, \neg o?) \\
\delta(\sigma, \mathsf{add1}, n) &\ni (n + 1, \sigma) \\
\delta(\sigma, \mathsf{add1}, v) &\ni (a, \sigma'[a \mapsto \bullet/\mathsf{num?}]) && \text{where } \delta(\sigma, \mathsf{num?}, v) \ni (\mathsf{true}, \sigma'),\ v \neq n, \text{ and } a \notin \sigma' \\
\delta(\sigma, \mathsf{add1}, v) &\ni (\mathsf{blame}_{\Lambda}, \sigma') && \text{where } \delta(\sigma, \mathsf{num?}, v) \ni (\mathsf{false}, \sigma')
\end{aligned}$$

Figure 3. Selected primitive operations



3.4.1 Simple proof system 

A simple proof system can be obtained which returns definite an- 
swers for concrete values, uses heap refinements, and handles nega- 
tion of predicates and disjointness of data types. 

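$$\begin{aligned}
&\sigma \vdash n : \mathsf{num?}\ ✓ \\
&\sigma \vdash n : o?\ ✗ && \text{if } o? \in \{\mathsf{cons?}, \mathsf{proc?}, \text{etc.}\} \\
&\sigma \vdash u/\vec{v} : v_i\ ✓ && \text{if } v_i \in \vec{v} \\
&\sigma \vdash u/\vec{v} : o?\ ✗ && \text{if } \neg o? \in \vec{v} \\
&\sigma \vdash a : v\ ✓ && \text{if } \sigma \vdash \sigma(a) : v\ ✓ \\
&\sigma \vdash a : v\ ✗ && \text{if } \sigma \vdash \sigma(a) : v\ ✗ \\
&\sigma \vdash a : v\ ? && \text{if } \sigma \vdash \sigma(a) : v\ ? \\
&\sigma \vdash v : v_c\ ? && \text{(conservative default)}
\end{aligned}$$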

Notice that the proof system only needs to handle a small number
of well-understood contracts. We rely on evaluation to naturally
break down complex contracts into smaller ones and take care of
subtle issues such as divergence and crashing. By the time we
have $u/\vec{v}$, we can assume all contracts in $\vec{v}$ have
terminated with success on $u$. With these simple and obvious rules,
our system can already verify a significant number of interesting
programs. With SMT solver integration, as described below, we can
handle far more interesting constraints, including relations between
numeric values, without requiring an encoding of the full language.
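The rules of the simple proof system translate almost directly into a three-valued decision procedure. The Python sketch below is one possible reading, under an assumed encoding: concrete numbers are Python ints, refined pre-values are `("refined", pre, contracts)` triples, negated predicates are `("not", o)` pairs, and the result strings stand for ✓, ✗ and ?.

```python
PROVED, REFUTED, NEITHER = "proved", "refuted", "unknown"


def prove(heap, v, contract):
    """Statically relate value v to contract under the refinements in heap."""
    if isinstance(v, str) and v in heap:
        # Address: answer the same question about the value it points to.
        return prove(heap, heap[v], contract)
    if isinstance(v, int) and not isinstance(v, bool):
        # Concrete number: definite answers for data-type predicates.
        if contract == "num?":
            return PROVED
        if contract in {"cons?", "proc?"}:
            return REFUTED
        return NEITHER
    if isinstance(v, tuple) and v and v[0] == "refined":
        _, _pre, known = v
        if contract in known:             # already known to satisfy it
            return PROVED
        if ("not", contract) in known:    # known to satisfy its negation
            return REFUTED
    return NEITHER                        # conservative default


heap = {
    "a": ("refined", "•", frozenset({"num?"})),
    "b": ("refined", "•", frozenset({("not", "cons?")})),
}
```

Anything the procedure cannot decide falls through to the conservative answer, which only costs precision.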

3.4.2 Integrating an SMT solver 

We extend the simple provability relation by employing an external 
solver. 

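We first define the translation $\{\!\{\cdot\}\!\}_S$ from heaps and
contract-value pairs into formulas in solver $S$:

$$\begin{aligned}
\{\!\{\sigma\}\!\}_S &= \bigwedge \{\!\{a : c\}\!\}_S \\
\{\!\{a_1 : (\texttt{>/c}\ n)\}\!\}_S &= \texttt{ASSERT}\ a_1 > n \\
\{\!\{a_1 : (\texttt{>/c}\ a_2)\}\!\}_S &= \texttt{ASSERT}\ a_1 > a_2 \\
\{\!\{a : (\texttt{=/c}\ (\texttt{+}\ a_1\ a_2))\}\!\}_S &= \texttt{ASSERT}\ a = a_1 + a_2
\end{aligned}$$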

The translation of a heap is the conjunction of all formulas generated 
from translatable refinements. The function is partial, and there are 
straightforward rules for translating specific pairs of (a : c) where 
c are drawn from a small set of simple, well-understood contracts. 
This mechanism is enough for the system to verify many interesting 
programs because the analysis relies on evaluation to break down 
complex, higher-order predicates. Not having a translation for some 
contract c only reduces precision and does not affect soundness. 



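Next, the extension $(\vdash_S)$ is straightforward. The old relation
$(\vdash)$ is refined by a solver $S$. Whenever the basic relation
proves $\sigma \vdash v : c\ ?$, we call out to the solver to try to
either prove or refute the claim:

$$\frac{\{\!\{\sigma\}\!\}_S \wedge \neg\{\!\{v : c\}\!\}_S \text{ is unsat}}{\sigma \vdash_S v : c\ ✓} \qquad \frac{\{\!\{\sigma\}\!\}_S \wedge \{\!\{v : c\}\!\}_S \text{ is unsat}}{\sigma \vdash_S v : c\ ✗}$$

The solver-aided relation uses refinements available on the heap to
generate premises $\{\!\{\sigma\}\!\}_S$. Unsatisfiability of
$\{\!\{\sigma\}\!\}_S \wedge \neg\{\!\{v : c\}\!\}_S$ is equivalent to
validity of $\{\!\{\sigma\}\!\}_S \Rightarrow \{\!\{v : c\}\!\}_S$,
hence value $v$ definitely satisfies contract $c$. Likewise,
unsatisfiability of $\{\!\{\sigma\}\!\}_S \wedge \{\!\{v : c\}\!\}_S$
means $v$ definitely refutes $c$. In any other case, we relate the
value-contract pair to the conservative answer.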
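A sketch of how such queries might be assembled is shown below. It emits SMT-LIB 2 text rather than the ASSERT syntax used above, does not invoke a solver, and assumes every address denotes an integer; the function names and the heap encoding (address to list of contract tuples) are illustrative assumptions.

```python
def term(t):
    """Render a number, an address name, or a nested tuple as an SMT-LIB term."""
    if isinstance(t, tuple):
        op, *args = t
        return "(" + " ".join([op] + [term(a) for a in args]) + ")"
    return str(t)


def translate(addr, contract):
    """Translate (addr : contract), or return None when no translation exists."""
    if len(contract) == 2 and contract[0] == ">/c":
        return f"(> {addr} {term(contract[1])})"
    if len(contract) == 2 and contract[0] == "=/c":
        return f"(= {addr} {term(contract[1])})"
    return None   # untranslatable contracts are skipped, losing only precision


def query(heap, addr, contract, negate):
    """Build a query: heap premises plus the (possibly negated) goal."""
    goal = translate(addr, contract)
    if goal is None:
        return None
    lines = [f"(declare-const {a} Int)" for a in sorted(heap)]
    for a in sorted(heap):
        for c in heap[a]:
            formula = translate(a, c)
            if formula is not None:
                lines.append(f"(assert {formula})")
    lines.append(f"(assert (not {goal}))" if negate else f"(assert {goal})")
    lines.append("(check-sat)")
    return "\n".join(lines)


heap = {"a1": [(">/c", 5)], "a2": [(">/c", "a1"), ("even?",)]}
prove_query = query(heap, "a2", (">/c", 0), negate=True)
```

If a solver reports `unsat` for `prove_query`, the value at `a2` definitely satisfies `(>/c 0)`; running the same query with `negate=False` and obtaining `unsat` would refute it instead.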

3.5 Program evaluation 

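We give a reachable-states semantics to programs: the initial
program $p$ is paired with an empty heap, and eval produces all states
in the reflexive, transitive closure of the single-step reduction
relation closed under evaluation contexts.

$$\begin{aligned}
\mathsf{eval} &: p \to \mathcal{P}(\varsigma) \\
\mathsf{eval}(\vec{m}\,e) &= \{\varsigma \mid \vec{m} \vdash (e'; e), \emptyset \longmapsto^{*} \varsigma\} \\
&\quad \text{where } e' = \mathsf{amb}(\{\mathsf{true}, \overrightarrow{(\mathsf{havoc}\ f)}\}),\ (\mathsf{module}\ f\ v_c\ v) \in \vec{m}
\end{aligned}$$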

Modules with unknown definitions, which we call opaque, com- 
plicate the definition of eval, since they may contain references to 
concrete modules. If only the main module is considered, an opaque 
module might misuse a concrete value in ways not visible to the 
system. We therefore apply havoc to each concrete module before 
evaluating the main expression. 
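Computing the reachable states of a non-deterministic step relation is a standard worklist traversal. The Python sketch below shows the idea on a toy step function whose only rule branches on an unknown test; the state encoding and `toy_step` are invented for the example, and the loop terminates only when the set of reachable states is finite.

```python
def reachable(initial, step):
    """Return every state reachable from `initial` via zero or more steps."""
    seen = {initial}
    todo = [initial]
    while todo:
        state = todo.pop()
        for successor in step(state):
            if successor not in seen:
                seen.add(successor)
                todo.append(successor)
    return seen


def toy_step(state):
    """A conditional on an unknown test may step to either branch."""
    if isinstance(state, tuple) and state[0] == "if•":
        return {state[1], state[2]}
    return set()          # anything else is a final state


program = ("if•", "ok", ("if•", "ok", "blame f"))
states = reachable(program, toy_step)
verified = "blame f" not in states
```

In this toy run `verified` is false, because one path through the unknown tests reaches a state blaming `f`.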

3.6 Soundness 

A program with unknown components is an abstraction of a fully- 
known program. Thus, the semantics of the abstracted program 
should approximate the semantics of any such concrete version. In 
particular, any behavior the concrete program exhibits should also 
be exhibited by the abstract approximation of that program. 

However, we must be precise as to which behaviors are rele- 
vant. Suppose we have a single concrete module that links against 
a single opaque module. The semantics of this program should in- 
clude all of the possible behaviors, both good and bad, of the known 
module assuming the opaque module always lives up to its contract. 
We exclude from consideration behaviors that cause the unknown 
module to be blamed, since it is of course impossible to verify an 
unknown program. In other words, we try to verify the parts of the 
program that are known, assuming arbitrary, but correct, behavior 
for the parts of the program that are unknown. 

For this reason, the precise semantic account of blame is crucial. 
The demonic havoc context can introduce blame of both the known 
and unknown modules; since we can distinguish these parties, it is 
easy to ignore blame of the unknown context. 

In the remainder of this section, we formally define the approxi- 
mation relation and show that evaluation preserves the approxima- 
tion, i.e. if program q is an approximation of program p (q is like p 
but with potentially more unknowns), then the evaluation of q is an 
approximation of the evaluation of p. 

Approximation: We define two approximation relations: between 
modules and between pairs of expressions and heaps. 

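We write $\varsigma \sqsubseteq \varsigma'$ to mean "$\varsigma'$
approximates $\varsigma$," or "$\varsigma$ refines $\varsigma'$,"
which intuitively means $\varsigma'$ stands for a set of states
including $\varsigma$. For example,
$(1, \{\}) \sqsubseteq (a, \{a \mapsto \bullet\})$.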

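$$\frac{(u/\vec{v}, \sigma_1) \sqsubseteq_F (\sigma_2(a), \sigma_2) \qquad F(a) = u/\vec{v}}{(u/\vec{v}, \sigma_1) \sqsubseteq_F (a, \sigma_2)} \qquad \frac{(\sigma_1(a_1), \sigma_1) \sqsubseteq_F (\sigma_2(a_2), \sigma_2) \qquad F(a_2) = a_1}{(a_1, \sigma_1) \sqsubseteq_F (a_2, \sigma_2)}$$

$$\frac{(u_1/\vec{v}_1, \sigma_1) \sqsubseteq_F (u_2/\vec{v}_2, \sigma_2) \qquad (v_c, \sigma_1) \sqsubseteq_F (v_d, \sigma_2)}{(u_1/\vec{v}_1 \cup \{v_c\}, \sigma_1) \sqsubseteq_F (u_2/\vec{v}_2 \cup \{v_d\}, \sigma_2)}$$

$$(u, \sigma_1) \sqsubseteq_F (\bullet, \sigma_2) \qquad \frac{(u_1/\vec{v}_1, \sigma_1) \sqsubseteq_F (u_2/\vec{v}_2, \sigma_2)}{(u_1/\vec{v}_1 \cup \vec{v}, \sigma_1) \sqsubseteq_F (u_2/\vec{v}_2, \sigma_2)}$$

$$\frac{(\mathsf{module}\ f\ v_c\ \bullet) \in \vec{m} \ \text{ or } \ f \in \{\dagger, \mathsf{havoc}\}}{(\mathsf{blame}^{f}, \sigma_1) \sqsubseteq_F (e, \sigma_2)}$$

Figure 4. Selected Approximation Rules

One complication introduced by addresses is that a single
address in the abstract program may accidentally approximate
multiple distinct values in the concrete one. Such accidental
approximations are not in general preserved by reduction, as in the
following example where $(e_1, \sigma_1) \sqsubseteq (e_2, \sigma_2)$: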

$e_1$ = (if false 1 2)    $\sigma_1 = \{\}$

$e_2$ = (if a a a)    $\sigma_2 = \{a \mapsto \bullet\}$

The abstract program does not continue to approximate the concrete
one in their next states:

$e_1 \longmapsto (2, \sigma_1')$    $\sigma_1' = \{\}$

$e_2 \longmapsto (a, \sigma_2')$    $\sigma_2' = \{a \mapsto \mathsf{false}\}$

We therefore also define a "strong" version of the approximation
relation, $\sqsubseteq_F$, where each address in the abstract program
approximates exactly one value in the concrete program, and this
consistency is witnessed by some function $F$ from addresses to
values. Then $e \sqsubseteq e'$ means that
$\exists F.\ e \sqsubseteq_F e'$. Since no such function exists
between $e_1$ and $e_2$ above, $e_1 \not\sqsubseteq_F e_2$ for any
$F$, and therefore $e_1 \not\sqsubseteq e_2$.
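The role of the witness $F$ can be illustrated with a small Python check that walks a concrete and an abstract expression in parallel and insists that each address stand for exactly one concrete value. The encoding (expressions as tuples, addresses as heap keys, unrefined unknowns only) is an assumption for this sketch and ignores refinements and heaps on the concrete side.

```python
UNKNOWN = "•"


def same(x, y):
    """Strict equality that does not confuse False with 0 or True with 1."""
    return type(x) is type(y) and x == y


def approx(concrete, abstract, heap, F):
    """Check concrete ⊑_F abstract, extending the witness F as needed."""
    if isinstance(abstract, str) and abstract in heap:
        if abstract in F:                    # address already has a witness
            return same(F[abstract], concrete)
        F[abstract] = concrete               # record the unique witness
        return approx(concrete, heap[abstract], heap, F)
    if isinstance(abstract, str) and abstract == UNKNOWN:
        return True                          # • approximates everything
    if isinstance(abstract, tuple):
        return (isinstance(concrete, tuple)
                and len(concrete) == len(abstract)
                and all(approx(c, a, heap, F)
                        for c, a in zip(concrete, abstract)))
    return same(concrete, abstract)


e1 = ("if", False, 1, 2)
shared = approx(e1, ("if", "a", "a", "a"), {"a": UNKNOWN}, {})
witness = {}
distinct = approx(e1, ("if", "a1", "a2", "a3"),
                  {"a1": UNKNOWN, "a2": UNKNOWN, "a3": UNKNOWN}, witness)
```

With a single shared address the check fails, since `a` would have to stand for both `False` and `1`; with three distinct addresses it succeeds and `witness` records the function $F$.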

Figure 4 shows the important cases in the definition of
$\sqsubseteq_F$; we omit structurally recursive rules. All pre-values
are approximated by •, and unknown values with contracts approximate
values that satisfy the same contracts. We extend the relation
structurally to evaluation contexts $\mathcal{E}$, point-wise to
sequences, and to sets of program states.

In the following example, $(e_1, \sigma_1) \sqsubseteq_F (e_2, \sigma_2)$,
where $F = \{a_1 \mapsto \mathsf{false}, a_2 \mapsto 1, a_3 \mapsto 2\}$:

$e_1$ = (if false 1 2)    $\sigma_1 = \{\}$

$e_2$ = (if $a_1$ $a_2$ $a_3$)    $\sigma_2 = \{a_1 \mapsto \bullet, a_2 \mapsto \bullet, a_3 \mapsto \bullet\}$

Notice that $F$'s domain is a superset of the domain of the heap
$\sigma_2$. In addition, our soundness result does not consider
additional errors that blame unknown modules or the havoc module, and
therefore we parameterize the approximation relation with the
module definitions $\vec{m}$ to select the opaque modules. We omit
these parameters where they are easily inferred to ease notation.

With the definition of approximation in hand, we are now in a 
position to state the main soundness theorem for the system. 

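Theorem 1 (Soundness of Symbolic $\lambda_C$).

If $p \sqsubseteq_{\vec{m}} q$ where $q = \vec{m}\,e$ and
$\varsigma \in \mathsf{eval}(p)$, then there exists some
$\varsigma' \in \mathsf{eval}(q)$ such that
$\varsigma \sqsubseteq_{\vec{m}} \varsigma'$.

We defer all proofs to the technical report for space.

3.7 Verification and the blame theorem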

We can now define verification as a simple corollary of soundness.
First we define when a module is verified by our approach.

Definition 1 (Verified module).

A module $(\mathsf{module}\ f\ u_c\ u) \in p$ is verified in $p$ if
$u \neq \bullet$ and $\mathsf{eval}(p) \not\ni \mathsf{blame}^{f}$.

Now, by soundness, $f$ is always safe.

Theorem 2 (Verified modules can't be blamed).

If a module named $f$ is verified in $p$, then for any concrete
program $q$ for which $p$ is an abstraction,
$\mathsf{eval}(q) \not\ni \mathsf{blame}^{f}$.

3.8 Taming the infinite state space 

A naive implementation of the above semantics will diverge for 
many programs. Consider the following example: 

(define (fact n) 

(if (= n 0) 1 (* n (fact (- n 1))))) 
(fact •) 

Ignoring error cases, it eventually reduces non-deterministically to
all of the following:

1    if $a_n \mapsto 0$
(* $a_n$ 1)    if $a_n \not\mapsto 0$, $a_{n-1} \mapsto 0$
(* $a_n$ (* $a_{n-1}$ (fact $a_{n-2}$)))    if $a_n, a_{n-1} \not\mapsto 0$

where $a_{n-1}$ is a fresh address resulting from subtracting one from
$a_n$. The process continues with $a_{n-2}$, $a_{n-3}$, etc. This
behavior from the analysis happens because it attempts to approximate
all possible concrete substitutions to abstract values. Although fact
terminates for all concrete naturals, there are an infinite number of
those: $a_n$ can be 0, 1, 2, and so on.

To enforce termination for all programs, we can resort to
well-known techniques such as finite state or pushdown abstractions
(Van Horn and Might 2012). But often those are overkill at
the cost of precision. Consider the following program:

(let* ([id (λ (x) x)] [y (id 0)] [z (id 1)])
  (< y z))

where a monovariant flow analysis such as 0CFA (Shivers 1988)
thinks y and z can be both 0 and 1, and pushdown analysis thinks y
is 0 and z is either 0 or 1. For a concrete, straight-line program,
such imprecision seems unsatisfactory. We therefore aim for an
analysis that provides exact execution for non-recursive programs and
retains enough invariants to verify interesting properties of
recursive ones. The analysis quickly terminates for a majority of
programming patterns with decent precision, although it is not
guaranteed to terminate in the general case (see section 4 for
empirical results).

One technical difficulty is that the semantics of contracts
prevents us from using a recursive function's contract directly as a
loop invariant, because contracts are only boundary-level enforcement.
It is unsound to assume returned values of internal calls can be
approximated by contracts, as in f below:

(f : nat? → nat?)

(define (f n) (if (= n 0) "" (str-len (f (- n 1)))))

If we assume the expression (f (- n 1)) returns a number as
specified in the contract, we will conclude f never returns, and is
blamed either for violating its own contract by returning a string,
or for applying str-len to a number. However, f returns 0 when
applied to 1. To soundly and precisely approximate this semantics in
the absence of types, we recover data type invariants by execution.
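The behavior of f is easy to confirm by transliterating it, here into Python with `len` standing in for str-len. The transliteration is only an illustration of the concrete semantics; contracts are not monitored in it.

```python
def f(n):
    """Transliteration of f: the base case returns a string, not a number."""
    return "" if n == 0 else len(f(n - 1))


result_at_one = f(1)   # len("") evaluates to 0
```

The internal call `f(0)` returns a string that never crosses the module boundary, so `f(1)` legitimately returns 0. Applying f to 2 goes wrong, because the string-length operation is then applied to the number 0 (a `TypeError` in this transliteration).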

Summarizing function results: To accelerate convergence, we
modify the application rules as follows. At each application, we
decide whether execution should step to the function's body or wait
for known results from other branches. When an application (f v)
reduces to a similar application, we plug in known results instead
of executing f's body again, avoiding the infinite loop.
Correspondingly, when (f v) returns, we plug the new-found answer into
contexts that need the result of (f v). The execution continues until
it has a set soundly describing the results of (f v).

Figure 5. Summarizing Semantics

$$\begin{array}{lrcl}
\text{Expressions} & e & \mathrel{+}= & (\mathsf{rt}_{\langle\sigma,v,v\rangle}\ e) \mid (\mathsf{blur}_{\langle F,\sigma,v\rangle}\ e) \\
\text{Values} & v & \mathrel{+}= & \mu x.\vec{v} \mid\ !x \\
\text{Evaluation contexts} & \mathcal{E} & \mathrel{+}= & (\mathsf{rt}_{\langle\sigma,v,v\rangle}\ \mathcal{E}) \mid (\mathsf{blur}_{\langle F,\sigma,v\rangle}\ \mathcal{E}) \\
\text{Context memo tables} & \Xi & ::= & \overrightarrow{\langle\sigma,v,v\rangle \mapsto \langle F,\sigma,\mathcal{E},\mathcal{E}\rangle} \\
\text{Value memo tables} & M & ::= & \overrightarrow{\langle\sigma,v,v\rangle \mapsto \langle v,\sigma\rangle} \\
\text{Renamings} & F & ::= & \overrightarrow{a \mapsto a}
\end{array}$$

Figure 6. Syntax extensions for approximation

To track information about application results and waiting
contexts, we augment the execution with two global tables $M$ and
$\Xi$ as shown in figure 6. We borrow the choice of metavariable names
from work on concrete summaries (Johnson and Van Horn 2014).

A value memo table $M$ maps each application to known results and
accompanying refinements. Intuitively, if
$M(\sigma, v_f, v_x) \ni (v, \sigma')$ then in some execution branch,
there is an application
$(v_f\ v_x), \sigma \longmapsto^{*} (v, \sigma')$.

A context memo table $\Xi$ maps each application to contexts waiting
for its result. Intuitively,
$\Xi(\sigma, v_f, v_x) \ni (F, \sigma', \mathcal{E}_1, \mathcal{E}_k)$
means that during evaluation, some expression
$\mathcal{E}_1[(\mathsf{rt}_{\langle\sigma, v_f, v_x\rangle}\ \mathcal{E}_k[(v_f\ v_z)])]$
with heap $\sigma'$ is paused because applying $(v_f\ v_z)$ under
assumptions in $\sigma'$ is subsumed by applying $(v_f\ v_x)$ under
assumptions in $\sigma$ up to consistent address renaming specified by
function $F$.

To keep track of function applications seen so far, we extend
the language with the expression
$(\mathsf{rt}_{\langle\sigma, v, v'\rangle}\ e)$, which marks $e$ as
being evaluated as the result of applying $v$ to $v'$, but otherwise
behaves like $e$. The expression
$(\mathsf{blur}_{\langle F, \sigma, v\rangle}\ e)$, whose detailed
role is discussed below, approximates $e$ under guidance from a
"previous" value $v$.

Finally, we add recursive contracts $\mu x.\vec{v}$ and recursive
references $!x$ for approximating inductive sets of values. For
example, $\mu x.\{\mathsf{empty}, \langle\bullet/\mathsf{nat?}, !x\rangle\}$
approximates all finite lists of naturals.



A state in the approximating semantics with summarization consists
of global tables $\Xi$, $M$, and a set $S$ of explored states
$\varsigma$.

Reduction now relates tables $\Xi$, $M$, and a set of states $S$ to
new tables $\Xi'$, $M'$ and a new set of states $S'$. We define a
relation $(\Xi, M, \varsigma) \longmapsto (\Xi', M', S')$ and then
lift this relation point-wise to sets of states. Figure 5 only shows
rules that use the global tables or new expression forms.

In the first rule, if an application $((\lambda x.e)\ v)$ is not
previously seen, execution proceeds as usual, evaluating expression
$e$ with $x$ bound to $v$, but marking this expression using rt.

Second, if a previous application of $((\lambda x.e)\ v_0)$ results in
application of the same function to a new argument $v$, we approximate
the new argument before continuing. Taking advantage of knowledge
of the previous argument, we guess the transition from $v_0$
to $v$ and heuristically emulate an arbitrary amount of such
transformation using the $\oplus$ operator. For example, if $v_0$ is
empty and $v$ is $\langle 1, \mathsf{empty}\rangle$, we approximate the
latter to $\mu x.\{\mathsf{empty}, \langle 1, !x\rangle\}$, denoting a
list of 1's. If a different number is later prepended to the
list, it is approximated to a list of numbers. The $\oplus$ operator
should work well in common cases and not hinder convergence in the
general case. Failure to give a good approximation to a value results
in non-termination but does not affect soundness.

Third, when an application results in a similar one with potentially
refined arguments, we avoid stepping into the function body
and use known results from table $M$ instead. In addition, we refine
the current heap to make better use of assumptions about the
particular "base case". We also remember the current context as one
waiting for the result of such application. To speed up convergence,
apart from feeding a new answer $v_a$ to the context, we wrap the
entire expression inside
$(\mathsf{blur}_{\langle F, \sigma, v\rangle}\ [\,])$ to approximate
the future result.

The fourth rule in figure 5 shows reduction for returning from
an application. Apart from the current context, the value is also
returned to any known context waiting on the same application, and
it is remembered in table $M$. The resumption and refinement are
analogous to the previous rule.
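The interplay of the two tables can be seen in a much simplified Python sketch: a value memo table `M` collects results per application, a context memo table `Xi` collects waiting continuations, and a repeated application waits for results instead of re-entering the body. The example analyzes factorial over a two-point abstract domain (`"zero"`, `"pos"`). It leaves out heaps, address renamings, and blur, so it illustrates the general idea only and is not the paper's semantics.

```python
from collections import defaultdict


def times(a, b):
    """Abstract multiplication over the domain {"zero", "pos"}."""
    return "zero" if "zero" in (a, b) else "pos"


def analyze_fact(root):
    """Collect abstract results of (fact root) and of every nested call."""
    M = defaultdict(set)      # application -> known results
    Xi = defaultdict(list)    # application -> continuations awaiting a result
    started = set()           # applications whose body is already running
    work = []                 # pending (continuation, result) deliveries

    def ret(arg, result):
        if result not in M[arg]:
            M[arg].add(result)                       # remember the result
            work.extend((k, result) for k in Xi[arg])  # resume waiters

    def call(arg, k):
        Xi[arg].append(k)                            # k now waits on arg
        work.extend((k, r) for r in M[arg])          # reuse known results
        if arg not in started:                       # first time: run body
            started.add(arg)
            body(arg)

    def body(n):
        if n == "zero":
            ret(n, "pos")                            # (fact 0) = 1
        else:
            for pred in ("zero", "pos"):             # n - 1 is zero or pos
                call(pred, lambda r, n=n: ret(n, times("pos", r)))

    call(root, lambda r: None)
    while work:
        k, r = work.pop()
        k(r)
    return {arg: set(results) for arg, results in M.items()}


summary = analyze_fact("pos")
```

The recursive application on `"pos"` is recognized as already in progress, so the analysis terminates with every application summarized as returning a positive number.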

Finally, expression $(\mathsf{blur}_{\langle F, \sigma, v_0\rangle}\ v)$
approximates value $v$ under guidance from the previous value $v_0$
and also approximates values on the heap from observation of the
previous case. Overall, the approximating operator $\oplus$ occurs in
three places: arguments of recursive applications, results of
recursive applications, and abstract values on the heap when recursive
applications return. Empirical results for our tool are presented in
section 4.

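Soundness of summarization: A system $(\Xi, M, S)$ approximates
a state $\varsigma$ if that state can be recovered from the system
through approximation rules. The crucial rule, given below, states
that if the system $(\Xi, M, S)$ already approximates expression $e$
and the application $(v_f\ v_x)$ is known to reduce to $e$, then
$(\Xi, M, S)$ is an approximation of $\mathcal{E}_k[e]$ where
$\mathcal{E}_k$ is a waiting context for this application.

$$\frac{\begin{array}{c}
(\mathsf{rt}_{\langle\ldots\rangle}\ [\,]) \notin \mathcal{E}_0 \qquad (v_x, \sigma) \sqsubseteq (v_z, \sigma') \qquad (v_y, \sigma) \sqsubseteq (v_z, \sigma') \\
\Xi(\sigma', v, v_z) \ni (F, \sigma', \mathcal{E}_0', \mathcal{E}_k') \qquad (\mathcal{E}_0, \sigma) \sqsubseteq (\mathcal{E}_0', \sigma') \\
(\mathcal{E}_k, \sigma) \sqsubseteq (\mathcal{E}_k', \sigma') \qquad (\mathcal{E}_0[(\mathsf{rt}_{\langle\sigma_1, v, v_y\rangle}\ e)], \sigma) \sqsubseteq (\Xi, M, S)
\end{array}}{(\mathcal{E}_0[(\mathsf{rt}_{\langle\sigma_0, v, v_x\rangle}\ \mathcal{E}_k[(\mathsf{rt}_{\langle\sigma_1, v, v_y\rangle}\ e)])], \sigma) \sqsubseteq (\Xi, M, S)}$$

As a consequence, summarization properly handles repetition of
waiting contexts, and gives results that approximate any number
of recursive applications. We refer readers to the appendix of the
accompanying technical report for the full definition of the
approximation relation.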

With this definition in hand, we can state the central lemma to 
establish the soundness of the revised semantics that uses summa- 
rization. 

Lemma 1 (Soundness of summarization).

If $\varsigma \sqsubseteq (\Xi, M, S)$ and
$\varsigma \longmapsto \varsigma'$, then
$(\Xi, M, S) \longmapsto^{*} (\Xi', M', S')$
such that $\varsigma' \sqsubseteq (\Xi', M', S')$.

The proof is given in the accompanying technical report. With 
this lemma in place, it is straightforward to replay the proof of the 
soundness and blame theorems. 

4. Implementation and evaluation 

To validate our approach, we implemented a static contract check- 
ing tool, SCV, based on the semantics presented in section 3, along 
with a number of implementation extensions for increased preci- 
sion and performance. We then applied SCV to a wide selection of 
programs drawn from the literature on verification of higher-order 
programs, and report on the results. 

The source code for SCV and all benchmarks are available along 
with instructions on reproducing the results we report here: 

github.com/philnguyen/soft-contract

In order to quantify the importance of the techniques presented 
in this paper, we also created a simpler tool which omits the key 
contributions of this work. This slimmed-down system, which we
refer to as "Simple" below, (a) does not call out to a solver, but 
relies on remembering seen contracts, (b) never refines the contracts 
associated with a heap address, but splits disjunctive contracts and 
unrolls recursive contracts, and (c) does not use our technique for 
summarizing repeated context. To enable a full comparison on all 
benchmarks, the Simple tool supports first-class contracts. This 
simpler system is extremely similar to that presented in our earlier
work (Tobin-Hochstadt and Van Horn 2012), but works on all of 
our benchmarks. 

Implementation extensions: SCV supports an extended language
beyond that presented in section 3 in order to handle realistic
programs. First, more base values and primitive operations are
supported, such as strings and symbols (and their operations),
although we do not yet use a solver to reason about values other than
integers. Second, data structure definitions are allowed at the top
level. Each new data definition induces a corresponding (automatic)
extension to the definition of havoc to deal with the new class of
data. Third, modules have multiple named exports, to handle the
examples presented in section 2, and can include local, non-exported
definitions. Fourth, functions can accept multiple arguments and can
be defined to have variable arity, as with +, which accepts
arbitrarily many arguments. This introduces new possibilities of
errors from arity mismatches. Fifth, a much more expressive contract
language is implemented, with and/c, or/c, struct/c, and μ/c for
conjunctive, disjunctive, data type, and recursive contracts,
respectively. Sixth, we provide solver back-ends for both CVC4
(Barrett et al. 2011) and Z3 (De Moura and Bjørner 2008).
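
To make the combinator vocabulary concrete, the following is a minimal sketch in Python of flat versions of the four contract forms named above. It is an illustration only, not SCV's implementation and not Racket's contract library: contracts are modeled simply as predicates, and every name in it (and_c, or_c, struct_c, mu_c, Cons, pos_list_c) is hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Any, Callable

Contract = Callable[[Any], bool]  # assumption: a flat contract is just a predicate

def and_c(*cs: Contract) -> Contract:
    """Conjunctive contract: the value must satisfy every component."""
    return lambda v: all(c(v) for c in cs)

def or_c(*cs: Contract) -> Contract:
    """Disjunctive contract: the value must satisfy at least one component."""
    return lambda v: any(c(v) for c in cs)

def struct_c(cls: type, *field_cs: Contract) -> Contract:
    """Data type contract: an instance of cls whose fields satisfy field_cs."""
    def check(v: Any) -> bool:
        if not isinstance(v, cls):
            return False
        values = [getattr(v, f.name) for f in fields(cls)]
        return len(values) == len(field_cs) and all(
            c(x) for c, x in zip(field_cs, values))
    return check

def mu_c(body: Callable[[Contract], Contract]) -> Contract:
    """Recursive contract: body receives a reference to the contract itself."""
    def self_ref(v: Any) -> bool:
        return contract(v)  # resolved lazily, after `contract` is bound below
    contract = body(self_ref)
    return contract

@dataclass
class Cons:
    head: Any
    tail: Any

def is_int(v: Any) -> bool:
    return isinstance(v, int) and not isinstance(v, bool)

def is_nil(v: Any) -> bool:
    return v is None

# A list of positive integers: either nil, or a Cons of a positive int and a list.
pos_list_c = mu_c(lambda self: or_c(
    is_nil,
    struct_c(Cons, and_c(is_int, lambda n: n > 0), self)))

print(pos_list_c(Cons(1, Cons(2, None))))   # well-formed list
print(pos_list_c(Cons(1, Cons(-2, None))))  # violates the positivity contract
```

The recursive case only works because the self-reference is looked up when the contract is applied rather than when it is built, which mirrors why a recursive contract form is needed in addition to the other three.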

Evaluating on existing benchmarks: To evaluate the applica- 
bility of SCV to a wide variety of challenging higher-order con- 
tract checking problems, we collect examples from the follow- 
ing sources: programs that make use of control-flow-based typing 
from work on occurrence typing (Tobin-Hochstadt and Felleisen 
2010), programs from work on soft typing, which uses flow anal- 
ysis to check the preconditions of operations (Cartwright and Fa- 
gan 1991), programs with sophisticated specifications from work 
on model checking higher-order recursion schemes (Kobayashi 
et al. 2011), programs from work on inference of dependent re- 
finement types (Terauchi 2010), and programs with rich contracts 
from our prior work on higher-order symbolic execution (Tobin- 
Hochstadt and Van Horn 2012). We also evaluate SCV on three 
interactive student video games built for a first-year programming 
course: Snake, Tetris, and Zombie. These programs were all orig- 
inally written as sample solutions, following the style expected of 
students in the course. Of these, Zombie is the most interesting: 
it was originally an object-oriented program, translated using the 
encoding seen in section 2.5. 

We present our results in summary form in table 1, grouping 
each of the above sets of benchmark programs; expanded forms 
of the tables are provided in the accompanying technical report. 
The table shows total line count (excluding blank lines and com- 
ments) and the number of static occurrences of contracts and primi- 
tives requiring dynamic checks such as function applications and 
primitive operations. These checks can be eliminated if we can 
show that they never fail; this has proven to produce significant 
speedups in practice, even without eliminating more expensive con-
tract checks (Tobin-Hochstadt et al. 2011).

The table reports time (in milliseconds) and the number of false
positives for SCV and our reduced system omitting the key contri-
butions of this work (labeled "Simple"); "∞" indicates a timeout
after 5 minutes.

A false positive is a contract violation that the analysis reports
but that, by human inspection, cannot happen. The programs we consider
are all known not to have contract errors, and thus every reported
potential error is a false positive.

In cases where a tool times out, we give an upper bound on 
the number of false positive error reports. For example, the Sim- 
ple system times out on two of the higher-order recursion scheme 
programs, meaning that if it were to complete, it would report at 
most 94 false positives, counting all contract checks from the two 
programs on which it times out, and the measured false positives on 
the programs where it completes. 
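
The bound described above can be written as a short calculation. The sketch below is illustrative only: the Result record, the function name, and the three sample entries are hypothetical and are not taken from Table 1.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Result:
    name: str
    checks: int                     # static contract/primitive checks in the program
    false_positives: Optional[int]  # None means the tool timed out

def false_positive_upper_bound(results: List[Result]) -> int:
    """Measured false positives where the tool finished, plus every check
    of each program on which it timed out (any of them might be reported)."""
    return sum(r.checks if r.false_positives is None else r.false_positives
               for r in results)

# Hypothetical numbers, chosen only to exercise both cases.
suite = [
    Result("a", checks=10, false_positives=2),
    Result("b", checks=40, false_positives=None),  # timeout
    Result("c", checks=25, false_positives=0),
]
print(false_positive_upper_bound(suite))  # 2 + 40 + 0 = 42
```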

Execution times are measured on a Core i7 2.7GHz laptop with 
8GB of RAM. 

Discussion: First, SCV works on benchmarks for a range of
previous static analyzers, from type systems to model checking to 
program analysis. 

Second, most programs are analyzed in a reasonable amount of
time; the longest remaining analysis time is under 30 seconds. This
demonstrates that although the termination acceleration method of
section 3.8 is not fully general, it is effective for many programming
patterns. For example, SCV terminates with good precision on last
from Wright and Cartwright (1997), which hides recursion behind
the Y combinator.

Table 1. Summary benchmark results. (See the accompanying technical report for detailed results.)

Third, across all benchmarks, over 99% (2329/2335) of the
contract checks are statically verified, enabling the elimination of
both small checks for primitive operations and expensive contracts;
see below for timing results. This result emphasizes the value of
static contract checking: gaining confidence about correctness from
expensive contracts without actually incurring their cost.

Overall, our experiments show that our approach is able to dis- 
cover and use invariants implied by conditional flows of control 
and contract checks. Obfuscations such as multiple layers of ab- 
stractions or complex chains of aliases do not impact precision (a 
common shortcoming of flow analysis). 

Our approach does not yet give a way to prove deep structural 
properties expressed as dependent contracts such as "map over a 
list preserves the length" or "all elements in the result of filter 
satisfy the predicate", resulting in the false positives seen in table 1. 
However, it can already be used to verify many interesting programs
because safety questions often depend only on knowledge of top-level
constructors. Examples of these patterns appear in programs from
Kobayashi et al. (2011) such as reverse (see also §2.4), nil, and
mem.

Finally, soft contract verification is more broadly applicable 
than the systems from which our benchmarks are drawn, which 
typically are successful only on their own benchmarks. For exam- 
ple, type systems such as occurrence typing (Tobin-Hochstadt and 
Felleisen 2010) cannot verify any non-trivial contracts, and most 
soft typing systems do not consider contracts at all. Systems based
on higher-order model checking (Kobayashi et al. 2011) and
dependent refinement types (Terauchi 2010) assume a typed language;
encoding our programs using large disjoint unions produces
unverifiable results.

This broad applicability is why we are not able to directly com- 
pare SCV to these other systems across all benchmarks. Instead, the 
Simple system serves as a benchmark for a system which does not 
contain our primary contributions. 

Contract optimization: We also report speedup results for the 
three most complex programs in our evaluation, which are inter- 
active games designed for first-year programming courses (Snake, 
Tetris, and Zombie). For each, we recorded a trace of input and 
timer events while playing the game, and then used that trace to re- 
run the game (omitting all graphical rendering) both with the con- 
tracts that we verified, and with the contracts manually removed. 
Each game was run 100 times in both modes; the total time is pre- 
sented below. 

Program | Contracts On (ms) | Contracts Off (ms)
snake   | 475,799           | 59
tetris  | 1,127,591         | 186
zombie  | 12,413            | 1,721



The timing results are quite striking — speedup ranges from over 
5x to over 5000x. This does not indicate, of course, that speedups 
of these magnitudes are achievable for real programs. Instead, it 
shows that programmers avoid the rich contracts we are able to 
verify, because of their unacceptable performance overhead. Soft 
contract verification therefore enables programmers to write these 
specifications without the run-time cost. 
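
As a quick check of the stated range, the speedup for each game is the ratio of the two totals in the table above. The short Python calculation below only restates that arithmetic; the variable and function names are ours, and the snake entry is read as 59 ms.

```python
# Total times in milliseconds over 100 runs, copied from the table above.
timings_ms = {
    "snake":  (475_799, 59),
    "tetris": (1_127_591, 186),
    "zombie": (12_413, 1_721),
}

def speedup(on_ms: int, off_ms: int) -> float:
    """Ratio of run time with contracts enabled to run time with them removed."""
    return on_ms / off_ms

for program, (on_ms, off_ms) in timings_ms.items():
    print(f"{program}: {speedup(on_ms, off_ms):.1f}x")
```

The ratios come out at roughly 8,000x for Snake, 6,000x for Tetris, and 7x for Zombie, consistent with the range quoted in the text.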

The difference in timing between Zombie and the other two 
games is intriguing because Zombie uses higher-order dependent 
contracts extensively, along the lines of vec/c from section 2.5, 
which intuitively should be more expensive. An investigation re- 
veals that most of the cost comes from monitoring flat contracts, es- 
pecially those that apply to data structures. For example, in Snake, 
disabling posn/c, a simple contract that checks for a posn struct 
with two numeric fields, cuts the run-time by a factor of 4. This 
contract is repeatedly applied to every such object in the game. In 
contrast, higher-order contracts, as in the object encodings used in 
Zombie, delay contracts and avoid this repeated checking. 
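
The contrast between the two kinds of contract can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the games' code and not Racket's contract system: Posn, posn_c, arrow_c, and move_right are hypothetical names, and a counter stands in for the run-time cost of monitoring.

```python
from dataclasses import dataclass
from typing import Any, Callable

flat_checks = 0  # counts how many times the flat contract is evaluated

@dataclass
class Posn:
    x: float
    y: float

def posn_c(v: Any) -> Any:
    """Flat contract: fully checked every time a value crosses a boundary."""
    global flat_checks
    flat_checks += 1
    if not (isinstance(v, Posn)
            and isinstance(v.x, (int, float))
            and isinstance(v.y, (int, float))):
        raise TypeError(f"contract violation: expected posn, got {v!r}")
    return v

def arrow_c(dom: Callable[[Any], Any], rng: Callable[[Any], Any]):
    """Higher-order contract: wrapping is cheap; checks run only on calls."""
    def wrap(f: Callable[[Any], Any]) -> Callable[[Any], Any]:
        return lambda arg: rng(f(dom(arg)))
    return wrap

def move_right(p: Posn) -> Posn:
    return Posn(p.x + 1, p.y)

guarded = arrow_c(posn_c, posn_c)(move_right)
print(flat_checks)    # wrapping alone performs no checks

p = Posn(0, 0)
for _ in range(3):
    p = guarded(p)    # each call checks the argument and the result
print(flat_checks, p)
```

Attaching the function contract costs nothing until the function is called, whereas the flat check on the data runs in full at every crossing, so a data contract on a value that is passed around constantly is paid for again and again.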

5. Related work 

In this section, we relate our work to four strands of research:
soft-typing, static contract verification, refinement types, and model 
checking of recursion schemes. 

Soft typing: Verifying the preconditions of primitive operations
can be seen as a weak form of contract verification, and soft typing
is a well-studied approach to this kind of verification (Cartwright
and Felleisen 1996). There are two predominant approaches to soft
typing. One is based on a generalization of Hindley-Milner type
inference (Cartwright and Fagan 1991; Wright and Cartwright 1997;
Aiken et al. 1994), which views an untyped program as being embedded
in a typed one and attempts to safely eliminate coercions (Henglein
1994). The other is founded on set-based abstract interpretation of
programs (Flanagan et al. 1996; Flanagan and Felleisen 1999). Both
approaches have proved effective for statically checking
preconditions of primitive operations, but neither scales to checking
pre- and post-conditions of arbitrary contracts. For example, Soft
Scheme (Cartwright and Fagan 1991) is not path-sensitive and does not
reason about arithmetic; thus it is unable to verify many of the
occurrence-typing or higher-order recursion scheme examples
considered in the evaluation.

Contract verification: Following in the set-based analysis tradi- 
tion of soft-typing, there has been work extending set-based anal- 
ysis to languages with contracts (Meunier et al. 2006). This work 
shares the overarching goal of this paper: to develop a static contract 
checking approach for components written in untyped languages 
with contracts. However, the work fails to capture the control-flow-
based type reasoning essential to analyzing untyped programs and is
unsound (as discussed by Tobin-Hochstadt and Van Horn (2012)).
Moreover, the set-based formulation is complex and difficult to ex-
tend to features considered here.

Our prior work (Tobin-Hochstadt and Van Horn 2012), as dis- 
cussed in the introduction, also performs soft contract verification, 
but with far less sophistication and success. As our empirical re- 
sults show, the contributions of this paper are required to tackle the 
arithmetic relations, flow-sensitive reasoning, and complex recur- 
sion found in our benchmarks. 

An alternative approach has been applied to checking contracts 
in Haskell and OCaml (Xu 2012; Xu et al. 2009), which is to in- 
line monitors into a program following a transformation by Findler 
and Felleisen (2002) and then simplify the program, either using the 
compiler, or a specialized symbolic engine equipped with an SMT 
solver. The approach would be applicable to untyped languages ex- 
cept for the final step dubbed logicization, a type-based transforma- 
tion of program expressions into first-order logic (FOL). A related 
approach used for Haskell is to use a denotational semantics that 
can be mapped into FOL, which is then model checked (Vytinio- 
tis et al. 2013), but this approach is highly dependent on the type 
structure of a program. Further, these approaches assume a differ- 
ent semantics for contract checking that monitors recursive calls. 
This allows the use of contracts as inductive hypotheses in recur- 
sive calls. In contrast, our approach can naturally take advantage 
of this stricter semantics of contract checking and type systems, but 
can also accommodate the more common and flexible checking pol- 
icy. Additionally, our approach does not rely on type information, 
the lack of which makes these approaches inapplicable to many of 
our benchmarks. 

Contract verification in the setting of typed, first-order contracts 
is much more mature. A prominent example is the work on verifying 
C# contracts as part of the Code Contracts project (Fähndrich and
Logozzo 2011).

Refinement type checking: Refinement types are an alternative
approach to statically verifying pre- and post-conditions in a
higher-order functional language. There are several approaches to
checking type refinements; one is to restrict the computational power
of refinements so that checking is decidable at type-checking time
(Freeman and Pfenning 1991); another is to allow unrestricted
refinements as in contracts, but to use a solver to attempt to
discharge refinements (Knowles and Flanagan 2010; Rondon et al. 2008;
Vazou et al. 2013). In the latter approach, when a refinement cannot
be discharged, some systems opt to reject the program (Rondon et al.
2008; Vazou et al. 2013), while others such as hybrid type-checking
residualize a run-time check to enforce the refinement (Knowles and
Flanagan 2010), similar to the way soft-typing residualizes primitive
pre-condition checks. The end result of our approach most closely
resembles that of hybrid checking, although the technique applies
regardless of the type discipline and approaches the problem using
different tools.

DJS (Chugh et al. 2012b,a) supports expressive refinement spec- 
ification and verification for stateful JavaScript programs, includ- 
ing sophisticated dependent specifications which SCV cannot ver- 
ify. However, most dependent properties require heavy annotations. 
Moreover, null inhabits every object type. Thus the approach can- 
not give the same guarantees about programs such as reverse 
(§2.4) without significantly more annotation burden. Additionally, 
it relies on whole program annotation, type-checking, and analysis. 

Model checking higher-order recursion schemes: Much of the
recent work on model checking of higher-order programs relies on
the decidability of model checking trees generated by higher-order
recursion schemes (HORS) (Ong 2006). A HORS is essentially a
program in the simply-typed λ-calculus with recursion and finitely
inhabited base types that generates (potentially infinite) trees.
Program verification is accomplished by compiling a program to a
HORS in which the generated tree represents program event sequences
(Kobayashi 2009b; Kobayashi et al. 2010). This method is sound and
complete for the simply typed λ-calculus with recursion and finite
base types, but the gap between this language and realistic languages
is significant. Subsequently, an untyped variant of HORS has been
developed (Tsukada and Kobayashi 2010), which has applications to
languages with more advanced type systems, but despite the name it
does not lead to a model checking procedure for the untyped
λ-calculus. A subclass of untyped HORS is the class of recursively
typed recursion schemes, which has applications to typed
object-oriented programs (Kobayashi and Igarashi 2013). In this
setting, model checking is undecidable, but relatively complete with
a certain recursive intersection type system (anything typable in
this system can be verified). To cope with infinite data domains such
as integers, counter-example guided abstraction refinement (CEGAR)
techniques have been developed (Kobayashi et al. 2011). The
complexity of model checking even for the simply typed case is
n-EXPTIME hard (where n is the rank of the recursion scheme), but
progress on decision procedures (Kobayashi and Ong 2009; Kobayashi
2009a) has led to verification engines that can verify a number of
"small but tricky higher-order functional programs in less than a
second."

In comparison, the HORS approach can verify some specifica- 
tions which SCV cannot, but in a simpler (typed) setting, whereas 
our lightweight method applies to richer languages. Our approach 
handles untyped higher-order programs with sophisticated language 
features and infinite data domains. Higher-order program invariants 
may be stated as behavioral contracts, while the HORS-based sys-
tems only support assertions on first-order data. Our work is also
able to verify programs with unknown external functions, not just 
unknown integer values, which is important for modular program 
verification, and we are able to verify many of the small but tricky 
programs considered in the HORS work. 



6. Conclusions and perspective 

We have presented a lightweight method and prototype implementation
for static contract checking using a non-standard reduction semantics
that is capable of verifying higher-order modular programs
with arbitrarily omitted components. Our tool, SCV, scales to real- 
istic language features such as recursive data structures and mod- 
ular programs, and verifies programs written in the idiomatic style 
of dynamic languages. The analysis proves the absence of run-time 
errors without excessive reliance on programmer help. With zero 
annotation, SCV already helps programmers find unjustified usage 
of partial functions with high precision and could even be modified 
to suggest inputs that break the program. With explicit contracts, 
programmers can enforce rich specifications on their programs and
have those optimized away without incurring the significant run-
time overhead entailed by dynamic enforcement.

While in this paper, we have addressed the problem of soft con- 
tract verification, the technical tools we have introduced apply be- 
yond this application. For example, a run of SCV can be seen as a 
modular program analysis — it soundly predicts which functions are 
called at any call site. Moreover it can be composed with whole- 
program analysis techniques to derive modular analyses (Van Horn 
and Might 2010). A small modification to blur to cause it to pick 
a small set of concrete values would turn our system into a con- 
colic execution engine (Larson and Austin 2003). Adding temporal 
contracts (Disney et al. 2011) to our system would produce a model
checker for higher-order languages. This breadth of application fol- 
lows directly from the semantics-based nature of our approach. 